Day 13 - String format, Rune, UTF-8


String Formatting in Go

Formatting strings is a fundamental concept in Go programming. The fmt package provides several functions to format, print, and return strings in a variety of ways. It is commonly used for debugging, logging, generating dynamic text, and building structured output for APIs.


Using fmt.Sprintf()

Sprintf() formats and returns a string instead of printing it to the console. It takes a format string and a list of arguments, then replaces format specifiers (%d, %s, etc.) with actual values.

Example:

package main
import "fmt"

func main() {
    x, y := 10, 20
    // Compute the sum in the call so the message cannot drift from the inputs.
    result := fmt.Sprintf("Sum of %d and %d is %d", x, y, x+y)
    fmt.Println(result)
}

Output:

Sum of 10 and 20 is 30

Common Format Specifiers

Specifier | Type | Example
%s | String | "Alice"
%d | Integer (base 10) | 30
%f | Float (default precision) | 3.141593
%.2f | Float (2 decimal places) | 3.14
%t | Boolean | true
%v | Any value (default format) | Alice, 30, true
%#v | Go-syntax representation | "Alice", 30
%T | Type of the value | string, int
%% | Literal percent sign | %

You can mix multiple specifiers in one line to build dynamic output.
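
As a minimal sketch of mixing specifiers, the helper below combines %s, %d, %.1f, %%, %t, and %T in one call. The function name describe and its parameters are made up for this illustration.

```go
package main

import "fmt"

// describe is a hypothetical helper that mixes several specifiers in one call.
func describe(name string, age int, score float64, active bool) string {
    return fmt.Sprintf("%s (%d) scored %.1f%%, active=%t, type of score: %T",
        name, age, score, active, score)
}

func main() {
    fmt.Println(describe("Alice", 30, 91.256, true))
    // Alice (30) scored 91.3%, active=true, type of score: float64
}
```

The arguments are consumed in order, so the count and types of the values should match the specifiers in the format string.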


Formatting Complex Data Types

Go's fmt package can also handle structs, arrays, and maps elegantly.

Example:

package main
import "fmt"

type Person struct {
    Name  string
    Email string
    Age   int
}

func main() {
    person := Person{Name: "John", Email: "johndoe@gmail.com", Age: 30}

    fmt.Printf("Person: %v\n", person)
    fmt.Printf("Detailed: %+v\n", person)
    fmt.Printf("Go's version: %#v\n", person)
    fmt.Printf("Only Name: %s\n", person.Name)
}

Output:

Person: {John johndoe@gmail.com 30}
Detailed: {Name:John Email:johndoe@gmail.com Age:30}
Go's version: main.Person{Name:"John", Email:"johndoe@gmail.com", Age:30}
Only Name: John

Explanation:

  • %v: prints field values only.
  • %+v: includes field names in the output.
  • %#v: gives the Go-syntax representation (useful for debugging or recreating objects).
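
The same verbs also apply to slices and maps. The sketch below uses a hypothetical helper named show to compare %v and %#v on both. Map keys are printed in sorted order by current Go releases (1.12 and later), which is what makes the expected output stable.

```go
package main

import "fmt"

// show is a hypothetical helper: it formats v with %v and with %#v.
func show(v interface{}) (plain, goSyntax string) {
    return fmt.Sprintf("%v", v), fmt.Sprintf("%#v", v)
}

func main() {
    nums := []int{1, 2, 3}
    ages := map[string]int{"bob": 25, "alice": 30}

    p, g := show(nums)
    fmt.Println(p) // [1 2 3]
    fmt.Println(g) // []int{1, 2, 3}

    p, g = show(ages)
    fmt.Println(p) // map[alice:30 bob:25]
    fmt.Println(g) // map[string]int{"alice":30, "bob":25}
}
```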

Alignment, Width, and Padding

You can control the width, alignment, and padding of formatted strings for better display in tables, reports, or logs.

Specifier | Description
%10s | Right-align in a 10-character width
%-10s | Left-align in a 10-character width
%010d | Pad an integer with zeros (width = 10)

Example:

package main
import "fmt"

func main() {
    fmt.Printf("%10s\n", "Go")
    fmt.Printf("%-10s!\n", "Go")
    fmt.Printf("%010d\n", 42)
}

Output:

        Go
Go        !
0000000042
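
Width specifiers are most useful when several values share a row. Here is a small sketch of a table row formatter; formatRow and its column widths are assumptions chosen for the example. Note that fmt measures width in runes, not in display columns.

```go
package main

import "fmt"

// formatRow is a hypothetical helper: name is left-aligned in 10 columns,
// qty is right-aligned in 5, and price is right-aligned in 8 with 2 decimals.
func formatRow(name string, qty int, price float64) string {
    return fmt.Sprintf("%-10s|%5d|%8.2f", name, qty, price)
}

func main() {
    fmt.Println(formatRow("Apple", 3, 1.5))
    fmt.Println(formatRow("Watermelon", 12, 24.999))
    // Apple     |    3|    1.50
    // Watermelon|   12|   25.00
}
```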

Formatted Output Functions Overview

Function | Description | Output goes to
fmt.Printf() | Prints a formatted string | Standard output
fmt.Sprintf() | Returns a formatted string | Nowhere; the string is returned to the caller
fmt.Fprintf() | Writes a formatted string | Any io.Writer (file, buffer, etc.)

Example using Fprintf():

package main
import (
    "fmt"
    "os"
)

func main() {
    name := "Alice"
    age := 25
    fmt.Fprintf(os.Stdout, "%s is %d years old.\n", name, age)
}

Output:

Alice is 25 years old.
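
Because Fprintf accepts any io.Writer, the destination can be swapped without touching the format string. The sketch below writes once to os.Stderr and once into a strings.Builder; the helper name greeting is an assumption for this example.

```go
package main

import (
    "fmt"
    "os"
    "strings"
)

// greeting builds its text in memory by pointing Fprintf at a strings.Builder.
func greeting(name string, age int) string {
    var sb strings.Builder
    fmt.Fprintf(&sb, "%s is %d years old.\n", name, age)
    return sb.String()
}

func main() {
    // Same kind of call, different destination: standard error.
    fmt.Fprintf(os.Stderr, "warning: %s\n", "this goes to stderr")
    fmt.Print(greeting("Alice", 25))
}
```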

Custom Formatting with the Stringer Interface

Go allows you to define custom string representations for your own types using the Stringer interface from the fmt package. This gives you control over how your struct or type should appear when printed using functions like fmt.Println() or fmt.Printf("%v", value).

The fmt.Stringer Interface

The interface is very simple and requires only one method:

type Stringer interface {
    String() string
}

Any type that implements this method can define how it should be converted to a string.

Example: Custom String Representation

package main

import (
    "fmt"
    "time"
)

type Event struct {
    Name      string
    Location  string
    StartTime time.Time
}

func (e Event) String() string {
    return fmt.Sprintf("Event: %s | 📍 %s | 🕒 %s",
        e.Name,
        e.Location,
        e.StartTime.Format("02-Jan-2006 03:04 PM"),
    )
}

func main() {
    event := Event{
        Name:      "Go Conference",
        Location:  "Bangalore",
        StartTime: time.Date(2025, 11, 3, 10, 30, 0, 0, time.Local),
    }

    // Automatically uses the custom String() method
    fmt.Println(event)

    // You can also use it with Printf
    fmt.Printf("Event Details: %v\n", event)
}

Output:

Event: Go Conference | 📍 Bangalore | 🕒 03-Nov-2025 10:30 AM
Event Details: Event: Go Conference | 📍 Bangalore | 🕒 03-Nov-2025 10:30 AM

🔍 Why Use Stringer?

  • Makes debugging easier: custom readable output for structs
  • Enhances logging clarity
  • Useful for producing a readable string for logs or JSON fields before encoding (encoding/json does not call String() on its own)
  • Provides consistent formatting across your program
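
Stringer is also a common way to give enum-like constants readable names. The following is a sketch only; the Status type and its constants are invented for this illustration.

```go
package main

import "fmt"

// Status is a hypothetical enum-like type used only for this illustration.
type Status int

const (
    Pending Status = iota
    Active
    Closed
)

// String makes Status satisfy fmt.Stringer.
func (s Status) String() string {
    switch s {
    case Pending:
        return "Pending"
    case Active:
        return "Active"
    case Closed:
        return "Closed"
    default:
        return fmt.Sprintf("Status(%d)", int(s))
    }
}

func main() {
    fmt.Println(Active)                   // Active
    fmt.Printf("%v %d\n", Closed, Closed) // Closed 2
    fmt.Println(Status(7))                // Status(7)
}
```

The %d verb still prints the underlying number, because String() is only consulted for verbs that accept strings, such as %v and %s.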

Rune in Go

In Go, rune is an alias for the type int32. A rune value represents one Unicode code point, which for most text means a single character (letter, emoji, symbol, etc.) in the Unicode standard.

type rune = int32

Every character, no matter what language or script it belongs to, has a unique code point defined by Unicode. Go uses rune to handle these characters safely and consistently, especially for multilingual and emoji text.


Why Rune Was Added in Go?

Go was designed with Unicode awareness from the start. Many earlier languages, C in particular, treated a "character" as a single byte (char), which is enough only for ASCII (English) text.

But non-ASCII characters (like ñ, 中, or 😊) take multiple bytes in UTF-8 encoding. To handle this properly, Go has rune, which represents a whole Unicode code point, not just a byte.


Example: Understanding Runes vs Bytes

package main

import (
    "fmt"
)

func main() {
    word := "Go😊"

    fmt.Println("String:", word)
    fmt.Println("Length in bytes:", len(word))
    fmt.Println("Runes in string:")

    for index, r := range word {
        fmt.Printf("Index: %d, Rune: %c, Unicode: %U\n", index, r, r)
    }
}

🧾 Output:

String: Go😊
Length in bytes: 6
Runes in string:
Index: 0, Rune: G, Unicode: U+0047
Index: 1, Rune: o, Unicode: U+006F
Index: 2, Rune: 😊, Unicode: U+1F60A

πŸ‘‰ Notice:

  • The byte length is 6 (because 😊 takes 4 bytes in UTF-8)
  • The rune count is 3 β€” each representing a full character
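
A related point worth sketching: indexing a string with s[i] yields a single byte, while range decodes whole runes. The helper nthRune below is hypothetical (it is not part of the standard library) and simply walks the string with range.

```go
package main

import "fmt"

// nthRune returns the n-th rune (0-based) of s; ok is false if s is too short.
func nthRune(s string, n int) (r rune, ok bool) {
    i := 0
    for _, c := range s {
        if i == n {
            return c, true
        }
        i++
    }
    return 0, false
}

func main() {
    word := "Go😊"
    fmt.Println(word[2]) // 240, the first byte of the emoji

    r, _ := nthRune(word, 2)
    fmt.Printf("%c %U\n", r, r) // 😊 U+1F60A
}
```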

Converting Between Bytes and Runes

package main

import (
    "fmt"
)

func main() {
    text := "Hi🌍"

    // Convert string to rune slice
    runes := []rune(text)
    fmt.Println("Runes:", runes)
    fmt.Println("Number of runes:", len(runes))

    // Convert rune slice back to string
    newText := string(runes)
    fmt.Println("Back to string:", newText)
}

Output:

Runes: [72 105 127757]
Number of runes: 3
Back to string: Hi🌍
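
One practical use of the []rune conversion is reversing a string without splitting multi-byte characters. This is a minimal sketch; reverse is a hypothetical helper, and it works per code point, so characters built from several code points (for example a letter plus a combining accent) would still need extra care.

```go
package main

import "fmt"

// reverse returns s with its runes in reverse order.
func reverse(s string) string {
    runes := []rune(s)
    for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
        runes[i], runes[j] = runes[j], runes[i]
    }
    return string(runes)
}

func main() {
    fmt.Println(reverse("Hi🌍")) // 🌍iH
}
```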

How Other Languages Handle Characters

Language | Type | Description
C | char (1 byte) | Only ASCII; multi-byte UTF-8 handled manually
Python | str | Fully Unicode-aware; a str is a sequence of code points (the in-memory encoding is an implementation detail)
Java | char (16 bits) | Uses UTF-16; a single char cannot hold every Unicode code point (some need surrogate pairs)
JavaScript | string | UTF-16 encoded; emojis and complex characters require special handling (codePointAt)
Go | rune (int32) | Native Unicode support with clear distinction between bytes ([]byte) and runes ([]rune)

Quick Summary

Concept | Type | Represents | Example
byte | uint8 | Raw byte (part of UTF-8 encoding) | 'A' → 65
rune | int32 | Unicode code point | '😊' → 128522
string | sequence of bytes | UTF-8 encoded text | "Hello 😊"

Example: Counting Runes in a String

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    text := "こんにちは" // Japanese for "Hello"
    fmt.Println("Bytes:", len(text))
    fmt.Println("Runes:", utf8.RuneCountInString(text))
}

Output:

Bytes: 15
Runes: 5

Each of these Japanese characters takes 3 bytes in UTF-8, but Go counts them properly as 5 runes.
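
The byte/rune difference matters when shortening text. Slicing by byte offset can cut a character in half, so the sketch below truncates by rune count instead. truncateRunes is a hypothetical helper written for this example.

```go
package main

import (
    "fmt"
    "unicode/utf8"
)

// truncateRunes keeps at most n runes of s without cutting a character in half.
func truncateRunes(s string, n int) string {
    if n <= 0 {
        return ""
    }
    count := 0
    for i := range s {
        if count == n {
            return s[:i] // i is the byte offset where rune number n starts
        }
        count++
    }
    return s
}

func main() {
    text := "こんにちは"
    fmt.Println(utf8.ValidString(text[:4])) // false: byte 4 is inside a character
    fmt.Println(truncateRunes(text, 2))     // こん
}
```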


Checking Rune Properties

The unicode package provides helper functions to test what kind of character a rune represents.

package main

import (
    "fmt"
    "unicode"
)

func main() {
    runes := []rune{'A', 'a', '1', '😊', '中', ' '}

    for _, r := range runes {
        fmt.Printf("Rune: %c\n", r)
        fmt.Printf("  IsLetter: %t\n", unicode.IsLetter(r))
        fmt.Printf("  IsDigit: %t\n", unicode.IsDigit(r))
        fmt.Printf("  IsSpace: %t\n", unicode.IsSpace(r))
        fmt.Printf("  IsUpper: %t\n", unicode.IsUpper(r))
        fmt.Printf("  IsLower: %t\n\n", unicode.IsLower(r))
    }
}

Output:

Rune: A
  IsLetter: true
  IsDigit: false
  IsSpace: false
  IsUpper: true
  IsLower: false

Rune: a
  IsLetter: true
  IsDigit: false
  IsSpace: false
  IsUpper: false
  IsLower: true

Rune: 1
  IsLetter: false
  IsDigit: true
  IsSpace: false
  IsUpper: false
  IsLower: false

Rune: 😊
  IsLetter: false
  IsDigit: false
  IsSpace: false
  IsUpper: false
  IsLower: false

Rune: 中
  IsLetter: true
  IsDigit: false
  IsSpace: false
  IsUpper: false
  IsLower: false

Rune:  
  IsLetter: false
  IsDigit: false
  IsSpace: true
  IsUpper: false
  IsLower: false

Go's Unicode functions work for any language or symbol set, not just ASCII!
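
These predicates combine naturally with a range loop. As a rough sketch, the hypothetical helper classify below tallies letters, digits, and whitespace in a mixed-script string.

```go
package main

import (
    "fmt"
    "unicode"
)

// classify counts letters, digits, and whitespace runes in s.
// Runes in none of these classes (punctuation, emoji, ...) are ignored.
func classify(s string) (letters, digits, spaces int) {
    for _, r := range s {
        switch {
        case unicode.IsLetter(r):
            letters++
        case unicode.IsDigit(r):
            digits++
        case unicode.IsSpace(r):
            spaces++
        }
    }
    return letters, digits, spaces
}

func main() {
    l, d, s := classify("Go 1.22 中文")
    fmt.Println(l, d, s) // 4 3 2
}
```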


UTF in Go

To handle text properly, Go provides full Unicode and UTF-8 support in the language and its standard library, with no third-party libraries needed. Let's explore what UTF means, how Go represents it, and how you can work with it safely.

What is UTF?

UTF stands for Unicode Transformation Format. It's a family of encodings (UTF-8, UTF-16, UTF-32) used to represent Unicode characters, i.e., all characters from all writing systems, emojis, symbols, etc.

Encoding | Size per code point | Description
UTF-8 | 1–4 bytes | Variable-length, most common (default in Go)
UTF-16 | 2 or 4 bytes | Used in Windows, Java
UTF-32 | 4 bytes | Fixed-length, but inefficient in size

Go source code and string literals are UTF-8, and a rune holds one Unicode code point as an int32, which is the same value UTF-32 would store.
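
To make the sizes listed above concrete, here is a sketch that measures one code point in each encoding. encodedSizes is a hypothetical helper; it relies on utf8.RuneLen and utf16.Encode from the standard library and treats UTF-32 as a fixed 4 bytes.

```go
package main

import (
    "fmt"
    "unicode/utf16"
    "unicode/utf8"
)

// encodedSizes reports how many bytes r occupies in UTF-8, UTF-16, and UTF-32.
func encodedSizes(r rune) (u8, u16, u32 int) {
    u8 = utf8.RuneLen(r)                   // 1 to 4 bytes
    u16 = 2 * len(utf16.Encode([]rune{r})) // one or two 16-bit units
    u32 = 4                                // always one 32-bit unit
    return u8, u16, u32
}

func main() {
    for _, r := range "A中😊" {
        a, b, c := encodedSizes(r)
        fmt.Printf("%c: UTF-8=%d UTF-16=%d UTF-32=%d\n", r, a, b, c)
    }
    // A: UTF-8=1 UTF-16=2 UTF-32=4
    // 中: UTF-8=3 UTF-16=2 UTF-32=4
    // 😊: UTF-8=4 UTF-16=4 UTF-32=4
}
```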


Go's Model of Text

In Go:

  • String = immutable sequence of bytes (UTF-8 by convention)
  • Rune = int32 value holding one Unicode code point

This dual representation allows Go to handle:

  • ASCII text efficiently
  • Multilingual / emoji text accurately
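
Because strings are immutable, changing a character means building a new string, typically through a []rune copy. The sketch below shows one way to do that; capitalize is a hypothetical helper, not a standard library function.

```go
package main

import (
    "fmt"
    "unicode"
)

// capitalize returns a copy of s with its first rune upper-cased.
func capitalize(s string) string {
    runes := []rune(s) // mutable copy; the original string is untouched
    if len(runes) == 0 {
        return s
    }
    runes[0] = unicode.ToUpper(runes[0])
    return string(runes)
}

func main() {
    s := "ñandú"
    // s[0] = 'N' would not compile: strings are immutable.
    fmt.Println(capitalize(s)) // Ñandú
    fmt.Println(s)             // ñandú
}
```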

Example: How UTF-8 Encodes Strings in Go

package main

import (
    "fmt"
)

func main() {
    s := "A中😊"

    fmt.Println("String:", s)
    fmt.Println("Bytes:", []byte(s))
    fmt.Println("Runes:", []rune(s))
}

Output:

String: A中😊
Bytes: [65 228 184 173 240 159 152 138]
Runes: [65 20013 128522]

Explanation:

  • A → 1 byte (ASCII)
  • 中 → 3 bytes (Chinese character)
  • 😊 → 4 bytes (emoji)
  • Each rune is stored as an int32 code point
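
To see where the three bytes for 中 come from, the sketch below hand-encodes a code point with the three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx) and compares it with utf8.EncodeRune. encode3 is a hypothetical, illustration-only helper: it assumes the input lies in U+0800..U+FFFF and does no validation.

```go
package main

import (
    "fmt"
    "unicode/utf8"
)

// encode3 hand-encodes a code point in the range U+0800..U+FFFF using the
// three-byte UTF-8 layout. For illustration only; input is not validated.
func encode3(r rune) [3]byte {
    return [3]byte{
        0xE0 | byte(r>>12),       // lead byte: top 4 bits
        0x80 | byte((r>>6)&0x3F), // continuation: middle 6 bits
        0x80 | byte(r&0x3F),      // continuation: low 6 bits
    }
}

func main() {
    manual := encode3('中') // U+4E2D
    fmt.Println(manual)    // [228 184 173]

    buf := make([]byte, utf8.UTFMax)
    n := utf8.EncodeRune(buf, '中')
    fmt.Println(buf[:n]) // [228 184 173]
}
```

In real code, utf8.EncodeRune (or a plain string conversion) is the better choice, since it handles every length and rejects invalid code points.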

UTF-8 Utilities in Go

Go's unicode/utf8 package provides essential tools to decode and inspect UTF-8 strings.

Example: Counting Runes

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    str := "Hello 世界😊"
    fmt.Println("Bytes:", len(str))
    fmt.Println("Runes:", utf8.RuneCountInString(str))
}

Output:

Bytes: 16
Runes: 9

Example: Decoding UTF-8 Manually

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    str := "世界"

    for i, w := 0, 0; i < len(str); i += w {
        r, width := utf8.DecodeRuneInString(str[i:])
        fmt.Printf("%c starts at byte %d (width: %d)\n", r, i, width)
        w = width
    }
}

Output:

世 starts at byte 0 (width: 3)
界 starts at byte 3 (width: 3)

DecodeRuneInString helps when you're working at the byte level (e.g., parsing text streams or files).
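
Byte-level input is not always valid UTF-8. When DecodeRuneInString hits a bad byte it returns utf8.RuneError with a width of 1, which the sketch below uses to count invalid bytes. countInvalid is a hypothetical helper for this illustration.

```go
package main

import (
    "fmt"
    "unicode/utf8"
)

// countInvalid returns the number of bytes in s that are not part of a valid
// UTF-8 sequence.
func countInvalid(s string) int {
    bad := 0
    for i := 0; i < len(s); {
        r, width := utf8.DecodeRuneInString(s[i:])
        // RuneError with width 1 signals a bad byte; a genuine U+FFFD in the
        // input decodes with width 3 and is not counted.
        if r == utf8.RuneError && width == 1 {
            bad++
        }
        i += width
    }
    return bad
}

func main() {
    fmt.Println(countInvalid("世界"))          // 0
    fmt.Println(countInvalid("a\xffb\xfe")) // 2
}
```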


Converting Between Strings, Bytes, and Runes

1. String → Runes

s := "Go😊"
runes := []rune(s)
fmt.Println(runes)  // [71 111 128522]

2. Runes → String

runes := []rune{71, 111, 128522}
s := string(runes)
fmt.Println(s) // Go😊

3. String → Bytes

s := "Go😊"
b := []byte(s)
fmt.Println(b) // [71 111 240 159 152 138]

4. Bytes → String

b := []byte{71, 111, 240, 159, 152, 138}
s := string(b)
fmt.Println(s) // Go😊
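
One conversion that often surprises people: turning an integer into a string with string(rune(n)) yields the character with that code point, not the decimal digits. The sketch below contrasts it with strconv.Itoa; both helper names are made up for this example.

```go
package main

import (
    "fmt"
    "strconv"
)

// asCharacter interprets n as a Unicode code point.
func asCharacter(n int) string {
    return string(rune(n))
}

// asDecimal renders n as decimal digits.
func asDecimal(n int) string {
    return strconv.Itoa(n)
}

func main() {
    fmt.Println(asCharacter(71)) // G
    fmt.Println(asDecimal(71))   // 71
}
```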

UTF-8 Validation

Go makes it easy to check if a string or byte slice is valid UTF-8.

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    data := []byte{0xff, 0xfe, 0xfd}
    fmt.Println("Valid UTF-8?", utf8.Valid(data))

    valid := []byte("Hello 世界")
    fmt.Println("Valid UTF-8?", utf8.Valid(valid))
}

Output:

Valid UTF-8? false
Valid UTF-8? true
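
Validation usually leads to a follow-up question: what to do with invalid input. One option, sketched here, is to replace bad bytes with a placeholder using strings.ToValidUTF8 (available since Go 1.13). The helper name sanitize and the "?" placeholder are assumptions for the example.

```go
package main

import (
    "fmt"
    "strings"
    "unicode/utf8"
)

// sanitize replaces each run of invalid bytes in s with "?".
func sanitize(s string) string {
    if utf8.ValidString(s) {
        return s // nothing to fix
    }
    return strings.ToValidUTF8(s, "?")
}

func main() {
    fmt.Println(sanitize("Hello 世界"))         // Hello 世界
    fmt.Println(sanitize("bad\xff\xfebyte")) // bad?byte
}
```

A run of consecutive invalid bytes collapses into a single replacement, which is why the two bad bytes above produce one "?".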

Why UTF-8 is Default in Go

  • UTF-8 is backward-compatible with ASCII (English text is unchanged).
  • Compact: uses fewer bytes than UTF-32 for most text.
  • Supported natively in Linux, macOS, web, databases, and network protocols.
  • Go's philosophy: simple, interoperable, and efficient. UTF-8 fits perfectly.

Working with UTF-16 or UTF-32

While Go uses UTF-8 by default, you can convert to/from UTF-16 or UTF-32 when working with other systems (e.g., Windows APIs, Java).

Example using unicode/utf16:

package main

import (
    "fmt"
    "unicode/utf16"
)

func main() {
    runes := []rune("Hello 😊")
    utf16Encoded := utf16.Encode(runes)
    fmt.Println("UTF-16 Encoded:", utf16Encoded)

    decoded := utf16.Decode(utf16Encoded)
    fmt.Println("Decoded:", string(decoded))
}

Output:

UTF-16 Encoded: [72 101 108 108 111 32 55357 56842]
Decoded: Hello 😊
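
Data from other systems usually arrives as raw bytes rather than []uint16. As a sketch, assuming little-endian UTF-16 without a byte-order mark, the hypothetical helper decodeUTF16LE below reassembles the 16-bit units with encoding/binary before decoding them.

```go
package main

import (
    "encoding/binary"
    "fmt"
    "unicode/utf16"
)

// decodeUTF16LE converts little-endian UTF-16 bytes (no byte-order mark) into
// a Go string. A trailing odd byte is ignored.
func decodeUTF16LE(b []byte) string {
    units := make([]uint16, 0, len(b)/2)
    for i := 0; i+1 < len(b); i += 2 {
        units = append(units, binary.LittleEndian.Uint16(b[i:]))
    }
    return string(utf16.Decode(units))
}

func main() {
    // "Hi" followed by the surrogate pair D83D DE0A for U+1F60A.
    raw := []byte{0x48, 0x00, 0x69, 0x00, 0x3D, 0xD8, 0x0A, 0xDE}
    fmt.Println(decodeUTF16LE(raw)) // Hi😊
}
```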

Key Packages Summary

Package | Purpose
unicode | Character classification & case conversion
unicode/utf8 | Encode/decode UTF-8
unicode/utf16 | Encode/decode UTF-16
strings | UTF-8 aware string operations
bytes | Efficient byte-level manipulation
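
One detail to keep in mind when combining these packages: functions in strings that return positions report byte offsets, not rune offsets. The sketch below converts one into the other; runeIndex is a hypothetical helper written for this example.

```go
package main

import (
    "fmt"
    "strings"
    "unicode/utf8"
)

// runeIndex returns the position of r in s counted in runes, or -1 if absent.
func runeIndex(s string, r rune) int {
    byteIdx := strings.IndexRune(s, r) // byte offset, not rune offset
    if byteIdx < 0 {
        return -1
    }
    return utf8.RuneCountInString(s[:byteIdx])
}

func main() {
    s := "Go😊!"
    fmt.Println(strings.IndexRune(s, '!')) // 6
    fmt.Println(runeIndex(s, '!'))         // 3
    fmt.Println(strings.ToUpper("ñandú"))  // ÑANDÚ
}
```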
