Golang Study Notes (2): Measuring String Length with len() and RuneCountInString()

2020-01-17


Author: 杰克小麻雀
Original link: https://blog.csdn.net/yushuaigee/article/details/103446306

len()

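len() is a built-in function. Applied to a string, it returns the length in bytes, which equals the number of characters only when the string is pure ASCII.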

Go string literals are stored as UTF-8, and each of the Chinese characters used here takes 3 bytes, so len() on a string of two Chinese characters returns 6.

package main

import "fmt"

func main() {
	str1 := "hello world"
	fmt.Println(len(str1)) //11

	str2 := "你好"
	fmt.Println(len(str2)) //6
}
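
The 3-bytes-per-character figure applies to the Chinese characters in this example, not to every non-ASCII character: UTF-8 uses between 1 and 4 bytes per code point. The following is a minimal sketch of that, using utf8.RuneLen from the standard library; encodedWidth is a hypothetical wrapper introduced only for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedWidth (hypothetical helper) reports how many bytes UTF-8
// needs to encode a single rune. It simply wraps utf8.RuneLen.
func encodedWidth(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// 'a', 'é' (U+00E9), '你', and an emoji (U+1F600).
	for _, r := range []rune{'a', '\u00e9', '你', '\U0001F600'} {
		fmt.Printf("%c: %d byte(s)\n", r, encodedWidth(r))
	}
}
```

The four runes need 1, 2, 3 and 4 bytes respectively, so len() of a string depends on which characters it contains, not only on how many.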

utf8.RuneCountInString()

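This function is provided by the unicode/utf8 package and counts the number of Unicode characters (runes, i.e. code points) in a string.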

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	str1 := "hello world"
	fmt.Println(utf8.RuneCountInString(str1)) //11

	str2 := "你好"
	fmt.Println(utf8.RuneCountInString(str2)) //2
}
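
A common alternative is to convert the string to a []rune and take its length. The sketch below compares the two approaches on the strings used in this note; runeCounts is a hypothetical helper, and the comparison assumes valid UTF-8 input.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCounts (hypothetical helper) counts the runes in s two ways:
// with utf8.RuneCountInString and by converting s to a []rune.
func runeCounts(s string) (viaUTF8, viaSlice int) {
	return utf8.RuneCountInString(s), len([]rune(s))
}

func main() {
	for _, s := range []string{"hello world", "你好", "hello 世界"} {
		a, b := runeCounts(s)
		fmt.Printf("%q: bytes=%d, RuneCountInString=%d, len([]rune)=%d\n", s, len(s), a, b)
	}
}
```

Both approaches give the same count here. The []rune conversion allocates a new slice, so utf8.RuneCountInString is usually the better fit when only the count is needed.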

The difference between ASCII and Unicode

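When iterating over a string, keep the difference between ASCII and Unicode in mind.

An ordinary indexed for loop visits the string one byte at a time. For ASCII characters that byte is the ASCII code; for a multi-byte character it is only one piece of its UTF-8 encoding.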

package main

import (
	"fmt"
)

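func main() {
	str := "hello 世界"
	fmt.Println(len(str)) // length in bytes
	// Indexing a string yields single bytes, not whole characters.
	for i := 0; i < len(str); i++ {
		fmt.Printf("ascii: %c\n", str[i])
	}
}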

Output:

12
ascii: h
ascii: e
ascii: l
ascii: l
ascii: o
ascii:
ascii: ä
ascii: ¸
ascii:
ascii: ç
ascii:
ascii:
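
The odd characters in this output appear because each byte of the UTF-8 encoding is printed with %c as if it were a code point of its own, and some of those values are invisible control characters. A small sketch that prints the same bytes in hexadecimal makes this easier to see; hexBytes is a hypothetical helper written for this note, not a standard library function.

```go
package main

import "fmt"

// hexBytes (hypothetical helper) renders every byte of s as two hex digits,
// separated by spaces, using the "% x" verb of the fmt package.
func hexBytes(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	// Each of the two characters is encoded as three bytes.
	fmt.Println(hexBytes("世界")) // e4 b8 96 e7 95 8c
}
```

The bytes e4, b8 and e7 correspond to the ä, ¸ and ç lines above; the remaining three bytes fall in a non-printing range, which is why their lines look empty.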

A for ... range loop decodes the string as it goes, so each iteration yields one complete Unicode code point (a rune).

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	str := "hello 世界"
	fmt.Println(utf8.RuneCountInString(str))
	for _, s := range str {
		fmt.Printf("Unicode: %c\n", s)
	}
}

Output:

8
Unicode: h
Unicode: e
Unicode: l
Unicode: l
Unicode: o
Unicode:
Unicode: 世
Unicode: 界
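
The loop above discards the first value that range returns. That value is the byte offset at which each rune starts, not a character index, so it jumps by more than one after a multi-byte character. A minimal sketch, with runeOffsets as a hypothetical helper written for this note:

```go
package main

import "fmt"

// runeOffsets (hypothetical helper) returns the byte offset at which
// each rune of s begins, as reported by a for ... range loop.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	str := "hello 世界"
	for i, r := range str {
		fmt.Printf("byte offset %d: %c (U+%04X)\n", i, r, r)
	}
	fmt.Println(runeOffsets(str)) // [0 1 2 3 4 5 6 9]
}
```

The offset goes from 6 straight to 9 because 世 occupies bytes 6 through 8.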

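Copyright notice: Unless otherwise stated, all articles on this blog are copyrighted by the author. Please credit the source when reposting.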