Day 13 - String format, Rune, UTF-8
String Formatting in Go
Formatting strings is a fundamental concept in Go programming.
The fmt package provides several functions to format, print, and return strings in a variety of ways.
It is commonly used for debugging, logging, generating dynamic text, and building structured output for APIs.
Using fmt.Sprintf()
Sprintf() formats and returns a string instead of printing it to the console.
It takes a format string and a list of arguments, then replaces format specifiers (%d, %s, etc.) with actual values.
Example:
package main
import "fmt"
func main() {
x, y := 10, 20
result := fmt.Sprintf("Sum of %d and %d is %d", x, y, x+y)
fmt.Println(result)
}
Output:
Sum of 10 and 20 is 30
Common Format Specifiers
| Specifier | Type | Example |
| %s | String | "Alice" |
| %d | Integer (base 10) | 30 |
| %f | Float (default precision) | 3.141593 |
| %.2f | Float (2 decimal places) | 3.14 |
| %t | Boolean | true |
| %v | Any value (default format) | Alice, 30, true |
| %#v | Go-syntax representation | "Alice", 30 |
| %T | Type of the value | string, int |
| %% | Literal percent sign | % |
You can mix multiple specifiers in one line to build dynamic output.
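As a small sketch of mixing specifiers, the helper below combines %s, %d, %.1f, %%, %t, and %T in a single Sprintf call. The function name describe and its parameters are made up for illustration.

```go
package main

import "fmt"

// describe builds one line from several values using mixed format specifiers.
// The name and parameters are illustrative, not part of any library.
func describe(name string, age int, score float64, active bool) string {
	return fmt.Sprintf("%s (%d) scored %.1f%%, active: %t, score type: %T",
		name, age, score, active, score)
}

func main() {
	fmt.Println(describe("Alice", 30, 92.456, true))
	// Alice (30) scored 92.5%, active: true, score type: float64
}
```

Note that %% is needed for the literal percent sign, because a single % always starts a specifier.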
Formatting Complex Data Types
Go's fmt package can also handle structs, arrays, and maps elegantly.
Example:
package main
import "fmt"
type Person struct {
Name string
Email string
Age int
}
func main() {
person := Person{Name: "John", Email: "johndoe@gmail.com", Age: 30}
fmt.Printf("Person: %v\n", person)
fmt.Printf("Detailed: %+v\n", person)
fmt.Printf("Go's version: %#v\n", person)
fmt.Printf("Only Name: %s\n", person.Name)
}
Output:
Person: {John johndoe@gmail.com 30}
Detailed: {Name:John Email:johndoe@gmail.com Age:30}
Go's version: main.Person{Name:"John", Email:"johndoe@gmail.com", Age:30}
Only Name: John
Explanation:
- %v prints field values only.
- %+v includes field names in the output.
- %#v gives the Go-syntax representation (useful for debugging or recreating objects).
Alignment, Width, and Padding
You can control the width, alignment, and padding of formatted strings for better display in tables, reports, or logs.
| Specifier | Description |
| %10s | Right-align in a 10-character width |
| %-10s | Left-align in a 10-character width |
| %010d | Pad an integer with zeros (width = 10) |
Example:
package main
import "fmt"
func main() {
fmt.Printf("%10s\n", "Go")
fmt.Printf("%-10s!\n", "Go")
fmt.Printf("%010d\n", 42)
}
Output:
        Go
Go        !
0000000042
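To see how width flags line up real columns, here is a minimal sketch that formats table rows. The helper formatRow and the widths 10 and 5 are arbitrary choices for this example. It also shows %*d, where the width comes from an argument instead of the format string.

```go
package main

import (
	"fmt"
	"strings"
)

// formatRow left-aligns the name in 10 columns and right-aligns the
// quantity in 5 columns. The widths are arbitrary choices for this sketch.
func formatRow(name string, qty int) string {
	return fmt.Sprintf("%-10s|%5d|", name, qty)
}

func main() {
	fmt.Println(formatRow("Apples", 12))
	fmt.Println(formatRow("Kiwis", 1500))
	fmt.Println(strings.Repeat("-", 17))

	// %*d reads the width (6 here) from the argument list.
	fmt.Printf("%*d\n", 6, 42)
}
```

fmt measures width in runes, not bytes, so a name such as "Äpfel" still pads to 10 characters.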
Formatted Output Functions Overview
| Function | Description | Output goes to |
| fmt.Printf() | Prints formatted string | Standard output |
| fmt.Sprintf() | Returns formatted string | Nowhere; the string is returned to the caller |
| fmt.Fprintf() | Writes formatted string | Any io.Writer (file, buffer, etc.) |
Example using Fprintf():
package main
import (
"fmt"
"os"
)
func main() {
name := "Alice"
age := 25
fmt.Fprintf(os.Stdout, "%s is %d years old.\n", name, age)
}
Output:
Alice is 25 years old.
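Because Fprintf accepts any io.Writer, the same formatting code can target memory instead of the terminal. The sketch below assumes two hypothetical helpers, writeGreeting and renderGreeting, and uses strings.Builder as the in-memory writer.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// writeGreeting writes to any io.Writer, so the caller picks the destination.
func writeGreeting(w io.Writer, name string, age int) error {
	_, err := fmt.Fprintf(w, "%s is %d years old.\n", name, age)
	return err
}

// renderGreeting collects the same output in memory.
func renderGreeting(name string, age int) string {
	var sb strings.Builder
	// strings.Builder writes always return a nil error, so it is ignored here.
	_ = writeGreeting(&sb, name, age)
	return sb.String()
}

func main() {
	fmt.Print(renderGreeting("Alice", 25))

	// The same function can write straight to standard output or a file.
	if err := writeGreeting(os.Stdout, "Bob", 31); err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
	}
}
```

Writing against io.Writer also makes the function easy to test, since a test can pass a buffer and compare the result.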
Custom Formatting with the Stringer Interface
Go allows you to define custom string representations for your own types using the Stringer interface from the fmt package.
This gives you control over how your struct or type should appear when printed using functions like fmt.Println() or fmt.Printf("%v", value).
The fmt.Stringer Interface
The interface is very simple: it requires only one method:
type Stringer interface {
String() string
}
Any type that implements this method can define how it should be converted to a string.
Example: Custom String Representation
package main
import (
"fmt"
"time"
)
type Event struct {
Name string
Location string
StartTime time.Time
}
func (e Event) String() string {
return fmt.Sprintf("Event: %s | 📍 %s | 🕒 %s",
e.Name,
e.Location,
e.StartTime.Format("02-Jan-2006 03:04 PM"),
)
}
func main() {
event := Event{
Name: "Go Conference",
Location: "Bangalore",
StartTime: time.Date(2025, 11, 3, 10, 30, 0, 0, time.Local),
}
// Automatically uses the custom String() method
fmt.Println(event)
// You can also use it with Printf
fmt.Printf("Event Details: %v\n", event)
}
Output:
Event: Go Conference | 📍 Bangalore | 🕒 03-Nov-2025 10:30 AM
Event Details: Event: Go Conference | 📍 Bangalore | 🕒 03-Nov-2025 10:30 AM
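Why Use Stringer?
- Makes debugging easier: custom readable output for structs
- Enhances logging clarity
- Useful for building readable log or message text (note that encoding/json does not call String(); it looks for json.Marshaler or encoding.TextMarshaler instead)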
- Provides consistent formatting across your program
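Stringer is also handy for small enum-like types. The sketch below uses a hypothetical Level type (not from any library) to show that %v and %s pick up String(), while %d still prints the underlying number.

```go
package main

import "fmt"

// Level is a hypothetical log-level type used only for this sketch.
type Level int

const (
	Debug Level = iota
	Info
	Warn
)

// String implements fmt.Stringer so %v and %s print a name instead of a number.
func (l Level) String() string {
	switch l {
	case Debug:
		return "DEBUG"
	case Info:
		return "INFO"
	case Warn:
		return "WARN"
	default:
		// Convert to int so the fallback formats the raw value.
		return fmt.Sprintf("Level(%d)", int(l))
	}
}

func main() {
	fmt.Println(Warn)                          // WARN
	fmt.Printf("%v %s %d\n", Info, Info, Info) // INFO INFO 1
	fmt.Printf("%v\n", Level(7))               // Level(7)
}
```

The default branch matters: without it, an unexpected value would print as an empty string and be hard to spot in logs.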
Rune in Go
In Go, a rune is an alias for the type int32.
It represents a Unicode code point, meaning a single character (letter, emoji, symbol, etc.) in the Unicode standard.
type rune = int32
Every character, no matter what language or script it belongs to, has a unique code point defined by Unicode.
Go uses rune to handle these characters safely and consistently, especially for multilingual and emoji text.
Why Was Rune Added to Go?
Go was designed with Unicode awareness from the start.
Older languages such as C treat a "character" as a single byte (char), which works only for ASCII (English) characters. Java and JavaScript use 16-bit UTF-16 code units, which cannot hold every code point in one unit.
But Unicode characters (like ñ, 中, or 😊) take multiple bytes in UTF-8 encoding.
To handle this properly, Go provides rune, which represents a whole Unicode code point, not just a byte.
Example: Understanding Runes vs Bytes
package main
import (
"fmt"
)
func main() {
word := "Go😊"
fmt.Println("String:", word)
fmt.Println("Length in bytes:", len(word))
fmt.Println("Runes in string:")
for index, r := range word {
fmt.Printf("Index: %d, Rune: %c, Unicode: %U\n", index, r, r)
}
}
Output:
String: Go😊
Length in bytes: 6
Runes in string:
Index: 0, Rune: G, Unicode: U+0047
Index: 1, Rune: o, Unicode: U+006F
Index: 2, Rune: 😊, Unicode: U+1F60A
Notice:
- The byte length is 6 (because 😊 takes 4 bytes in UTF-8)
- The rune count is 3, each representing a full character
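One consequence worth seeing in code: indexing a string with s[i] returns a single byte, not a character. The sketch below contrasts that with a hypothetical helper, runeAt, which walks the string with range to find the n-th rune.

```go
package main

import "fmt"

// runeAt returns the n-th rune of s (counting from 0), or false if s is shorter.
// It is an illustrative helper, not a standard library function.
func runeAt(s string, n int) (rune, bool) {
	i := 0
	for _, r := range s {
		if i == n {
			return r, true
		}
		i++
	}
	return 0, false
}

func main() {
	word := "Go😊"

	// Indexing yields one byte of the emoji's 4-byte encoding.
	fmt.Printf("word[2] = %#x\n", word[2]) // word[2] = 0xf0

	if r, ok := runeAt(word, 2); ok {
		fmt.Printf("rune 2  = %c (%U)\n", r, r)
	}
}
```

For repeated random access, converting once with []rune(s) is usually simpler than walking the string each time.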
Converting Between Bytes and Runes
package main
import (
"fmt"
)
func main() {
text := "Hi🌍"
// Convert string to rune slice
runes := []rune(text)
fmt.Println("Runes:", runes)
fmt.Println("Number of runes:", len(runes))
// Convert rune slice back to string
newText := string(runes)
fmt.Println("Back to string:", newText)
}
Output:
Runes: [72 105 127757]
Number of runes: 3
Back to string: Hi🌍
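A common use of the []rune conversion is reversing text without breaking multi-byte characters. This is a minimal sketch; reverseRunes is an illustrative name, and it works per code point, so sequences built from several code points (combining accents, some emoji) are not kept together.

```go
package main

import "fmt"

// reverseRunes reverses a string one code point at a time.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	fmt.Println(reverseRunes("Hi🌍")) // 🌍iH
}
```

Reversing the bytes instead would scramble the emoji's four bytes and produce invalid UTF-8.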
How Other Languages Handle Characters
| Language | Type | Description |
| C | char (1 byte) | Holds a single byte; multi-byte UTF-8 must be handled manually |
| Python | str | Sequence of Unicode code points (CPython stores each string with 1, 2, or 4 bytes per code point) |
| Java | char (16 bits) | Uses UTF-16, cannot directly handle all Unicode code points (needs surrogate pairs) |
| JavaScript | string | UTF-16 encoded; emojis and complex characters require special handling (codePointAt) |
| Go | rune (int32) | Native Unicode support with clear distinction between bytes ([]byte) and runes ([]rune) |
Quick Summary
| Concept | Type | Represents | Example |
| byte | uint8 | Raw byte (part of UTF-8 encoding) | 'A' → 65 |
| rune | int32 | Unicode code point | '😊' → 128522 |
| string | sequence of bytes | UTF-8 encoded text | "Hello 😊" |
Example: Counting Runes in a String
package main
import (
"fmt"
"unicode/utf8"
)
func main() {
text := "こんにちは" // Japanese for "Hello"
fmt.Println("Bytes:", len(text))
fmt.Println("Runes:", utf8.RuneCountInString(text))
}
Output:
Bytes: 15
Runes: 5
Each Japanese character takes 3 bytes, but Go counts them properly as 5 runes.
Checking Rune Properties
The unicode package provides helper functions to test what kind of character a rune represents.
package main
import (
"fmt"
"unicode"
)
func main() {
runes := []rune{'A', 'a', '1', '😊', '中', ' '}
for _, r := range runes {
fmt.Printf("Rune: %c\n", r)
fmt.Printf(" IsLetter: %t\n", unicode.IsLetter(r))
fmt.Printf(" IsDigit: %t\n", unicode.IsDigit(r))
fmt.Printf(" IsSpace: %t\n", unicode.IsSpace(r))
fmt.Printf(" IsUpper: %t\n", unicode.IsUpper(r))
fmt.Printf(" IsLower: %t\n\n", unicode.IsLower(r))
}
}
Output:
Rune: A
IsLetter: true
IsDigit: false
IsSpace: false
IsUpper: true
IsLower: false
Rune: a
IsLetter: true
IsDigit: false
IsSpace: false
IsUpper: false
IsLower: true
Rune: 1
IsLetter: false
IsDigit: true
IsSpace: false
IsUpper: false
IsLower: false
Rune: 😊
IsLetter: false
IsDigit: false
IsSpace: false
IsUpper: false
IsLower: false
Rune: 中
IsLetter: true
IsDigit: false
IsSpace: false
IsUpper: false
IsLower: false
Rune:
IsLetter: false
IsDigit: false
IsSpace: true
IsUpper: false
IsLower: false
Go's Unicode functions work for any language or symbol set, not just ASCII!
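The same checks can be applied to a whole string. The sketch below tallies letters, digits, spaces, and everything else in mixed-script text; the counts struct and classify function are invented for this example.

```go
package main

import (
	"fmt"
	"unicode"
)

// counts holds per-category totals; the struct is illustrative only.
type counts struct{ letters, digits, spaces, other int }

// classify walks s rune by rune and tallies each category.
func classify(s string) counts {
	var c counts
	for _, r := range s {
		switch {
		case unicode.IsLetter(r):
			c.letters++
		case unicode.IsDigit(r):
			c.digits++
		case unicode.IsSpace(r):
			c.spaces++
		default:
			c.other++ // punctuation, emoji, symbols
		}
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", classify("Go 1.22 中文 😊"))
	// {letters:4 digits:3 spaces:3 other:2}
}
```

Because range yields runes, the Chinese characters count as two letters rather than six bytes.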
UTF in Go
To handle text properly, Go provides full Unicode and UTF-8 support right in its core language, with no extra libraries needed. Let's explore what UTF means, how Go represents it, and how you can work with it safely.
What is UTF?
UTF stands for Unicode Transformation Format. It's a family of encodings (UTF-8, UTF-16, UTF-32) used to represent Unicode characters, i.e., all characters from all writing systems, emojis, symbols, etc.
| Encoding | Size | Description |
| UTF-8 | 1 to 4 bytes | Variable-length, most common (default in Go) |
| UTF-16 | 2 or 4 bytes | Used in Windows, Java |
| UTF-32 | 4 bytes | Fixed-length, but inefficient in size |
Go source code and string literals are UTF-8, and a rune holds one Unicode code point as an int32, the same value a UTF-32 code unit would carry.
Go's Model of Text
In Go:
- String = immutable sequence of bytes (UTF-8 encoded by convention)
- Rune = a single Unicode code point (int32)
This dual representation allows Go to handle:
- ASCII text efficiently
- Multilingual / emoji text accurately
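Since a string is just bytes, it can hold data that is not valid UTF-8. The sketch below shows what range does in that case (it yields U+FFFD, the replacement character, for each bad byte) and uses an illustrative helper, countInvalid, to count such bytes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// countInvalid reports how many bytes of s are not part of a valid UTF-8 sequence.
func countInvalid(s string) int {
	n := 0
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		// RuneError with size 1 marks an invalid byte; a real U+FFFD
		// in the text decodes with size 3 and is not counted.
		if r == utf8.RuneError && size == 1 {
			n++
		}
		i += size
	}
	return n
}

func main() {
	s := "a\xffb" // the byte 0xff never appears in valid UTF-8

	for i, r := range s {
		fmt.Printf("%d: %U\n", i, r)
	}
	fmt.Println("invalid bytes:", countInvalid(s))
}
```

The program still compiles and runs; Go does not reject the string, it only substitutes U+FFFD while decoding.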
Example: How UTF-8 Encodes Strings in Go
package main
import (
"fmt"
)
func main() {
s := "A中😊"
fmt.Println("String:", s)
fmt.Println("Bytes:", []byte(s))
fmt.Println("Runes:", []rune(s))
}
Output:
String: A中😊
Bytes: [65 228 184 173 240 159 152 138]
Runes: [65 20013 128522]
Explanation:
- A: 1 byte (ASCII)
- 中: 3 bytes (Chinese character)
- 😊: 4 bytes (emoji)
- Each rune is stored as an int32 code point
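The byte counts above can be checked by encoding runes one at a time. This sketch uses utf8.EncodeRune and utf8.RuneLen from the standard library; the encode wrapper is just a convenience for the example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encode returns the UTF-8 bytes for a single rune.
func encode(r rune) []byte {
	buf := make([]byte, utf8.UTFMax) // UTFMax is 4, the longest possible encoding
	n := utf8.EncodeRune(buf, r)
	return buf[:n]
}

func main() {
	for _, r := range []rune{'A', 'ñ', '中', '😊'} {
		// "% x" prints the bytes in hex, separated by spaces.
		fmt.Printf("%c %U -> % x (%d bytes)\n", r, r, encode(r), utf8.RuneLen(r))
	}
}
```

The hex bytes for 中 (e4 b8 ad) and 😊 (f0 9f 98 8a) match the decimal values 228 184 173 and 240 159 152 138 shown in the output above.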
UTF-8 Utilities in Go
Go's unicode/utf8 package provides essential tools to decode and inspect UTF-8 strings.
Example: Counting Runes
package main
import (
"fmt"
"unicode/utf8"
)
func main() {
str := "Hello 世界😊"
fmt.Println("Bytes:", len(str))
fmt.Println("Runes:", utf8.RuneCountInString(str))
}
Output:
Bytes: 16
Runes: 9
Example: Decoding UTF-8 Manually
package main
import (
"fmt"
"unicode/utf8"
)
func main() {
str := "世界"
for i, w := 0, 0; i < len(str); i += w {
r, width := utf8.DecodeRuneInString(str[i:])
fmt.Printf("%c starts at byte %d (width: %d)\n", r, i, width)
w = width
}
}
Output:
世 starts at byte 0 (width: 3)
界 starts at byte 3 (width: 3)
DecodeRuneInString helps when you're working at the byte level (e.g., parsing text streams or files).
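Byte-level awareness matters when a string has to fit a size limit, for example a database column measured in bytes. Below is a minimal sketch of truncating on a rune boundary with utf8.RuneStart; truncateBytes is a hypothetical helper, and it assumes the input is valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateBytes returns the longest prefix of s that fits in limit bytes
// without cutting a multi-byte character in half.
// It assumes s is valid UTF-8.
func truncateBytes(s string, limit int) string {
	if limit <= 0 {
		return ""
	}
	if len(s) <= limit {
		return s
	}
	// Step back until the cut lands on the first byte of a rune.
	for limit > 0 && !utf8.RuneStart(s[limit]) {
		limit--
	}
	return s[:limit]
}

func main() {
	s := "世界 hello"
	for _, limit := range []int{2, 4, 6, 20} {
		fmt.Printf("%2d -> %q\n", limit, truncateBytes(s, limit))
	}
}
```

A plain s[:4] here would keep one whole character plus a stray byte of the next one, which prints as a replacement character.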
Converting Between Strings, Bytes, and Runes
1. String → Runes
s := "Go😊"
runes := []rune(s)
fmt.Println(runes) // [71 111 128522]
2. Runes → String
runes := []rune{71, 111, 128522}
s := string(runes)
fmt.Println(s) // Go😊
3. String → Bytes
s := "Go😊"
b := []byte(s)
fmt.Println(b) // [71 111 240 159 152 138]
4. Bytes → String
b := []byte{71, 111, 240, 159, 152, 138}
s := string(b)
fmt.Println(s) // Go😊
UTF-8 Validation
Go makes it easy to check if a string or byte slice is valid UTF-8.
package main
import (
"fmt"
"unicode/utf8"
)
func main() {
data := []byte{0xff, 0xfe, 0xfd}
fmt.Println("Valid UTF-8?", utf8.Valid(data))
valid := []byte("Hello 世界")
fmt.Println("Valid UTF-8?", utf8.Valid(valid))
}
Output:
Valid UTF-8? false
Valid UTF-8? true
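Once invalid input is detected, one option is to repair it rather than reject it. The sketch below uses strings.ToValidUTF8 from the standard library (Go 1.13 and later) to swap invalid bytes for U+FFFD; the sanitize wrapper and the choice of replacement are assumptions for this example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sanitize replaces invalid UTF-8 byte sequences in s with U+FFFD.
func sanitize(s string) string {
	if utf8.ValidString(s) {
		return s // nothing to fix
	}
	return strings.ToValidUTF8(s, "\uFFFD")
}

func main() {
	raw := "caf\xc3 au lait\xff"
	clean := sanitize(raw)

	fmt.Printf("%q -> %q\n", raw, clean)
	fmt.Println("Valid UTF-8?", utf8.ValidString(clean))
}
```

Whether to replace, drop (pass an empty replacement), or reject bad input depends on the application; replacing keeps the rest of the text usable.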
Why UTF-8 is Default in Go
- UTF-8 is backward-compatible with ASCII (English text is unchanged).
- Compact: uses fewer bytes than UTF-32 for most text.
- Supported natively in Linux, macOS, web, databases, and network protocols.
- Go's philosophy: simple, interoperable, and efficient. UTF-8 fits perfectly.
Working with UTF-16 or UTF-32
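While Go uses UTF-8 by default, you can convert to/from UTF-16 or UTF-32 when working with other systems (e.g., Windows APIs, Java). The standard library covers UTF-16 with unicode/utf16, and a []rune already holds the code point values that UTF-32 encodes.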
Example using unicode/utf16:
package main
import (
"fmt"
"unicode/utf16"
)
func main() {
runes := []rune("Hello 😊")
utf16Encoded := utf16.Encode(runes)
fmt.Println("UTF-16 Encoded:", utf16Encoded)
decoded := utf16.Decode(utf16Encoded)
fmt.Println("Decoded:", string(decoded))
}
Output:
UTF-16 Encoded: [72 101 108 108 111 32 55357 56842]
Decoded: Hello 😊
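The last two numbers in the encoded output (55357 and 56842) are a surrogate pair: UTF-16 needs two 16-bit units for any code point above U+FFFF. The sketch below shows the pair directly with utf16.EncodeRune and utf16.DecodeRune; utf16Units is an illustrative helper.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// utf16Units reports how many 16-bit code units s needs in UTF-16.
func utf16Units(s string) int {
	return len(utf16.Encode([]rune(s)))
}

func main() {
	hi, lo := utf16.EncodeRune('😊')
	fmt.Printf("surrogates: %#x %#x\n", hi, lo) // surrogates: 0xd83d 0xde0a
	fmt.Printf("decoded: %U\n", utf16.DecodeRune(hi, lo))

	// 7 runes, but 8 UTF-16 units because the emoji needs a pair.
	fmt.Println(utf16Units("Hello 😊"))
}
```

This is why string lengths can differ between Go and UTF-16-based environments such as Java or JavaScript for the same text.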
Key Packages Summary
| Package | Purpose |
| unicode | Character classification & case conversion |
| unicode/utf8 | Encode/decode UTF-8 |
| unicode/utf16 | Encode/decode UTF-16 |
| strings | UTF-8 aware string operations |
| bytes | Efficient byte-level manipulation |
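To illustrate the "UTF-8 aware" row for the strings package, here is a short sketch combining strings.ToUpper, strings.IndexRune, and strings.Map with the unicode package. The lettersOnly helper is invented for this example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// lettersOnly drops every rune that is not a letter, in any script.
func lettersOnly(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsLetter(r) {
			return r
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, s)
}

func main() {
	s := "héllo, 世界! 😊"

	fmt.Println(strings.ToUpper(s))         // case conversion works beyond ASCII
	fmt.Println(strings.IndexRune(s, '世')) // a byte offset, not a rune index
	fmt.Println(lettersOnly(s))
}
```

Keep in mind that the index functions in strings return byte offsets, so the result is suitable for slicing the original string but not for counting characters.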