
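When toStyledString is called to serialize a JSON object into a string, jsoncpp automatically converts Chinese characters into Unicode escape sequences, for example: \u5e73\u548c\u671b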
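To see where those escapes come from, here is a minimal, self-contained sketch of the same idea: decode each UTF-8 sequence to a code point, then write anything outside ASCII as \uXXXX (with a surrogate pair above U+FFFF). The names decodeUtf8, hex4 and escapeNonAscii are hypothetical helpers written for this illustration, not jsoncpp functions, and the decoder assumes well-formed input.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: decode one UTF-8 sequence starting at s[i] and
// advance i past it. Assumes well-formed input; performs no validation.
static std::uint32_t decodeUtf8(const std::string& s, std::size_t& i) {
    unsigned char b0 = static_cast<unsigned char>(s[i++]);
    if (b0 < 0x80) return b0;                        // plain ASCII byte
    int extra = (b0 >= 0xF0) ? 3 : (b0 >= 0xE0) ? 2 : 1;
    std::uint32_t cp = b0 & (0x3F >> extra);         // payload bits of the lead byte
    while (extra-- > 0 && i < s.size())
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    return cp;
}

// Hypothetical helper: format the low 16 bits as four lowercase hex digits.
static std::string hex4(std::uint32_t v) {
    char buf[5];
    std::snprintf(buf, sizeof buf, "%04x", static_cast<unsigned>(v & 0xFFFF));
    return buf;
}

// Escape every non-ASCII code point as \uXXXX, using a surrogate pair
// for code points outside the Basic Multilingual Plane.
std::string escapeNonAscii(const std::string& utf8) {
    std::string out;
    for (std::size_t i = 0; i < utf8.size();) {
        std::uint32_t cp = decodeUtf8(utf8, i);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x10000) {
            out += "\\u" + hex4(cp);
        } else {
            cp -= 0x10000;
            out += "\\u" + hex4((cp >> 10) + 0xD800);
            out += "\\u" + hex4((cp & 0x3FF) + 0xDC00);
        }
    }
    return out;
}
```

Passing the UTF-8 bytes of 平和望 (E5 B9 B3, E5 92 8C, E6 9C 9B) to escapeNonAscii yields exactly the escaped form shown above, since those three characters are U+5E73, U+548C and U+671B.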

The fix is as follows:

  1. In src\lib_json\json_writer.cpp, find the valueToQuotedStringN function. This is the code that quotes and escapes a string value when it is written out as JSON:

    static JSONCPP_STRING valueToQuotedStringN(const char* value, unsigned length) {
        if (value == NULL)
            return "";

        if (!isAnyCharRequiredQuoting(value, length))
            return JSONCPP_STRING("\"") + value + "\"";
        // We have to walk value and escape any special characters.
        // Appending to JSONCPP_STRING is not efficient, but this should be rare.
        // (Note: forward slashes are *not* rare, but I am not escaping them.)
        JSONCPP_STRING::size_type maxsize = length * 2 + 3; // allescaped+quotes+NULL
        JSONCPP_STRING result;
        result.reserve(maxsize); // to avoid lots of mallocs
        result += "\"";
        char const* end = value + length;
        for (const char* c = value; c != end; ++c) {
            switch (*c) {
            case '\"':
                result += "\\\"";
                break;
            case '\\':
                result += "\\\\";
                break;
            case '\b':
                result += "\\b";
                break;
            case '\f':
                result += "\\f";
                break;
            case '\n':
                result += "\\n";
                break;
            case '\r':
                result += "\\r";
                break;
            case '\t':
                result += "\\t";
                break;
            // case '/':
            // Even though \/ is considered a legal escape in JSON, a bare
            // slash is also legal, so I see no reason to escape it.
            // (I hope I am not misunderstanding something.)
            // blep notes: actually escaping \/ may be useful in javascript to avoid </
            // sequence.
            // Should add a flag to allow this compatibility mode and prevent this
            // sequence from occurring.
            default: {
                unsigned int cp = utf8ToCodepoint(c, end);
                // don't escape non-control characters
                // (short escape sequence are applied above)
                if (cp < 0x80 && cp >= 0x20)
                    result += static_cast<char>(cp);
                else if (cp < 0x10000) { // codepoint is in Basic Multilingual Plane
                    result += "\\u";
                    result += toHex16Bit(cp);
                } else { // codepoint is not in Basic Multilingual Plane
                         // convert to surrogate pair first
                    cp -= 0x10000;
                    result += "\\u";
                    result += toHex16Bit((cp >> 10) + 0xD800);
                    result += "\\u";
                    result += toHex16Bit((cp & 0x3FF) + 0xDC00);
                }
            } break;
            }
        }
        result += "\"";
        return result;
    }
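  2. The default branch is the code that turns Chinese text into Unicode escape strings: it decodes each UTF-8 sequence to a code point and writes that code point as 16-bit hexadecimal. To output Chinese characters as they are, this part of the source has to be changed. Replace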

    default: {
        unsigned int cp = utf8ToCodepoint(c, end);
        // don't escape non-control characters
        // (short escape sequence are applied above)
        if (cp < 0x80 && cp >= 0x20)
            result += static_cast<char>(cp);
        else if (cp < 0x10000) { // codepoint is in Basic Multilingual Plane
            result += "\\u";
            result += toHex16Bit(cp);
        } else { // codepoint is not in Basic Multilingual Plane
                 // convert to surrogate pair first
            cp -= 0x10000;
            result += "\\u";
            result += toHex16Bit((cp >> 10) + 0xD800);
            result += "\\u";
            result += toHex16Bit((cp & 0x3FF) + 0xDC00);
        }
    } break;

    with:

    default: 
        result += *c;
        break;

    so that every byte, including the UTF-8 bytes of Chinese characters, is written out directly.
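One caveat with the one-line default branch: it also stops escaping control characters below 0x20 that have no short escape (for example 0x01), and JSON requires those to be escaped. The following is a minimal standalone sketch of a slightly more conservative variant, assuming the input is already valid UTF-8. The function name quoteKeepUtf8 is hypothetical and is not part of jsoncpp; it only illustrates the logic that could go into the default branch.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of jsoncpp): quote a UTF-8 string for JSON.
// Bytes >= 0x80 are copied through untouched, so Chinese text stays readable;
// only the characters JSON requires to be escaped are escaped.
std::string quoteKeepUtf8(const std::string& value) {
    std::string result = "\"";
    for (unsigned char c : value) {
        switch (c) {
        case '"':  result += "\\\""; break;
        case '\\': result += "\\\\"; break;
        case '\b': result += "\\b";  break;
        case '\f': result += "\\f";  break;
        case '\n': result += "\\n";  break;
        case '\r': result += "\\r";  break;
        case '\t': result += "\\t";  break;
        default:
            if (c < 0x20) {
                // Remaining control characters still need a \u00XX escape.
                char buf[7];
                std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                result += buf;
            } else {
                // Printable ASCII and every byte of a multi-byte UTF-8 sequence.
                result += static_cast<char>(c);
            }
        }
    }
    result += "\"";
    return result;
}
```

With this variant, the UTF-8 bytes of 平 come back unchanged between the quotes, while a stray 0x01 byte is still written as \u0001.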
