diff --git a/BUILD.gn b/BUILD.gn
index 0f62ddf..e1168a3 100644
--- a/BUILD.gn
+++ b/BUILD.gn
@@ -31,6 +31,7 @@ ohos_rust_shared_library("lib") {
     "default",
     "vec_array",
     "btree_object",
+    "c_adapter",
   ]
 }
@@ -47,5 +48,6 @@ ohos_rust_unittest("rust_ylong_json_unit_test") {
     "--cfg=feature=\"default\"",
     "--cfg=feature=\"vec_array\"",
     "--cfg=feature=\"btree_object\"",
+    "--cfg=feature=\"c_adapter\"",
   ]
 }
diff --git a/README.md b/README.md
index 2c99743..7f6ff07 100644
--- a/README.md
+++ b/README.md
@@ -1,30 +1,33 @@
 # ylong_json
 ## Introduction
-The `ylong_json` module provides serialization of text or string in JSON syntax format and deserialization of corresponding generated instances.
+`ylong_json` is a general `JSON` syntax parsing library that provides functions for converting `JSON` text to and from specific data structures.
-### ylong_json in Openharmony
+### ylong_json in OpenHarmony
 ![structure](./figures/ylong_json_oh_relate.png)
-Here is the description of the key fields in the figure above:
-- `ylong_json` : System component that provides json serialization and deserialization capabilities
-- `serde` : Third-party library for efficient and generic serialization and deserialization of Rust data structures.
+The following is a description of the key fields in the above figure:
+- `Application Layer`: The application layer provides specific functions to users.
+- `App`: Various applications need to use the functions of the system service layer.
+- `System Service Layer`: System service layer, which provides system services to upper-layer applications.
+- `system services`: Various system services require the use of `JSON` related functions.
+- `ylong_json`: System component, providing common `JSON` serialization and deserialization capabilities to related components of the system service layer.
+- `serde`: third-party library for efficient and versatile serialization and deserialization of `Rust` data structures.
 ### ylong_json Internal architecture diagram
 ![structure](./figures/ylong_json_inner_structure.png)
-`ylong_json` is mainly divided into two modules, a module with a custom `JsonValue` structure type as the core and a module that ADAPTS to the third-party library `serde`.
+`ylong_json` is mainly divided into three submodules: `JsonValue` submodule, `serde` submodule, and C-ffi submodule.
-1. `JsonValue` is the internal custom structure type of `ylong_json`, and the serialization and deserialization function of `json` is built with this structure as the core.
-- `JsonValue` : The core structure type, which stores the json content information, has 6 internal enum type variants.
-- `LinkedList`, `Vec`, `BTreeMap` : Three ways of storing data inside `Array` and `Object`, selected by `features`.
-- Serialization ability: Supports outputting a `JsonValue` instance as a compact strings or writing to the output stream.
-- Deserialization ability: Supports parsing `json` text or `json` content in the input stream and generating a `JsonValue` instance.
+The `JsonValue` submodule provides a basic data structure `JsonValue`.
+`JsonValue` supports serializing itself into `JSON` text in either indented or compact format. Any syntactically correct `JSON` text can also be deserialized into a corresponding `JsonValue` data structure.
+`JsonValue` supports addition, deletion, modification and query, and you can use the specified interface to change the data content in `JsonValue`.
+`JsonValue` supports all data types in `JSON` syntax: `null`, `boolean`, `number`, `string`, `array`, `object`, and implements all its functions according to `ECMA-404`.
+For the `array` and `object` structures, `JsonValue` provides several underlying data structures for different usage scenarios: an `array` can be backed by `Vec` or `LinkedList`, and an `object` can be backed by `Vec`, `LinkedList` or `Btree`.
+The underlying data structure affects the creation and query performance of `array` and `object`: for example, an `object` based on `Btree` performs better in queries, while `LinkedList` or `Vec` performs better in creation.
-2. `ylong_json` adapts to the third-party library `serde`
-- `Serializer`: The auxiliary structure for serialization.
-- `Deserializer`: The auxiliary structure for deserialization.
-- Serialization ability: Supports for serializing a type instance that implements the `serde::Serialize` trait into `json` text content or writing the content to the output stream.
-- Deserialization ability: If the `json` content has the type that implements `serde::Deserialize` trait, then that part of the `json` content can be deserialized into an instance of that type.
+The `serde` submodule provides procedural macro functions based on the `Serialize` and `Deserialize` traits provided by the `serde` third-party library, which support fast conversion between user structures and `JSON` text.
+The advantage of `serde` compared to `JsonValue` is that it is easy to use. Users do not need to convert the `JSON` text to `JsonValue` and then extract the specified data from it to generate the `Rust` structure. They only need to mark the structure with the `Serialize` and `Deserialize` procedural macro attributes, and can then use the interfaces provided in `ylong_json` to serialize the structure into `JSON` text, or convert the corresponding `JSON` text into a user structure.
+The C-ffi module provides a C interface layer based on the `JsonValue` module, which lets users call the functions of the `ylong_json` library through a C interface.
 ## Directory
 ```
 ylong_json
@@ -72,40 +75,44 @@ external_deps = ["ylong_json:lib"]
 See [user_guide](./docs/user_guide.md)
 ## Performance test
+The following tests are from [`nativejson-benchmark`](https://www.github.com/miloyip/nativejson-benchmark).
+
+The test environment information is as follows:
 ```
-1.Test environment
-OS: Linux
-Architecture: x86_64
-Byte Order: Little Endian
-Model number: Intel(R) Xeon(R) Gold 6278C CPU @ 2.60GHz
+OS: Ubuntu 7.3.-16ubuntu3
+Processor: Intel(R) Xeon(R) Gold 6278C CPU @ 2.60GHz
 CPU(s): 8
-MemTotal: 16G
-
-2.Test result
-| Serialize | ylong_json | serde_json |
-------------------------------------------------
-| null | 150 ns/iter | 175 ns/iter |
-| boolean | 155 ns/iter | 178 ns/iter |
-| number | 309 ns/iter | 291 ns/iter |
-| string | 513 ns/iter | 413 ns/iter |
-| array | 998 ns/iter | 1,075 ns/iter |
-| object | 1,333 ns/iter | 1,348 ns/iter |
-| example1 | 12,537 ns/iter | 12,288 ns/iter |
-| example2 | 23,754 ns/iter | 21,936 ns/iter |
-| example3 | 103,061 ns/iter | 97,247 ns/iter |
-| example4 | 15,234 ns/iter | 17,895 ns/iter |
-
-| Deserialize | ylong_json | serde_json |
---------------------------------------------------
-| null | 257 ns/iter | 399 ns/iter |
-| boolean | 260 ns/iter | 400 ns/iter |
-| number | 1,507 ns/iter | 989 ns/iter |
-| string | 414 ns/iter | 610 ns/iter |
-| array | 2,258 ns/iter | 2,148 ns/iter |
-| object | 810 ns/iter | 1,386 ns/iter |
-| example1 | 10,191 ns/iter | 10,227 ns/iter |
-| example2 | 15,753 ns/iter | 18,022 ns/iter |
-| example3 | 55,910 ns/iter | 59,717 ns/iter |
-| example4 | 18,461 ns/iter | 12,471 ns/iter |
+Memory: 8.0 G
 ```
+Software versions tested:
+
+cJSON 1.7.16
+
+Test Results:
+```
+======= ylong-json ==== parse | stringify ====
+canada.json            200 MB/s     90 MB/s
+citm_catalog.json      450 MB/s    300 MB/s
+twitter.json           340 MB/s    520 MB/s
+
+======== cJSON ======== parse | stringify ====
+canada.json             55 MB/s     11 MB/s
+citm_catalog.json      260 MB/s    170 MB/s
+twitter.json           210 MB/s    210 MB/s
+```
+
+Description of test results:
+
+Three test files are provided in the `nativejson-benchmark` test. Among them, `canada.json` contains a large number of `number` values, `citm_catalog.json` has a relatively even mix of data types, and `twitter.json` contains various `UTF-8` characters.
+To ensure test fairness, `ylong_json` enables the `list_object`, `list_array` and `ascii_only` features.
+The `list_object` and `list_array` features keep the data structures consistent with `cJSON`, so that both libraries use linked lists.
+The `ascii_only` feature keeps the handling of `UTF-8` characters consistent, because `cJSON` does not process `UTF-8` characters.
+
+The testing process is as follows:
+- Read the content of the file into memory to get the file content `content`.
+- Call the deserialization interface of the specified `JSON` library to generate the corresponding `JSON` structure `data`.
+- Call the serialization interface of the `JSON` structure to generate the output content `result`.
+- Using `content`, deserialize into the `JSON` structure 100 times and take the smallest processing time `t1`.
+- Using `data`, serialize into `JSON` text 100 times and take the smallest processing time `t2`.
+- Calculate the speeds: the deserialization speed is the length of `content` divided by `t1`, and the serialization speed is the length of the `JSON` text divided by `t2`.
\ No newline at end of file
diff --git a/README_zh.md b/README_zh.md
index 3bf5e4a..4101085 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -1,29 +1,33 @@
 # ylong_json
 ## 简介
-`ylong_json` 模块提供了 JSON 语法格式文本或字符串的序列化功能,以及对应生成实例的反序列化功能。
+`ylong_json` 是一个通用的 `JSON` 语法解析库,提供了 `JSON` 文本和特定数据结构之间的相互转换功能。
-### ylong_json 在 Openharmony 中的位置
+### ylong_json 在 OpenHarmony 中的位置
 ![structure](./figures/ylong_json_oh_relate.png)
 以下是对于上图关键字段的描述信息:
-- `ylong_json`:提供 `json` 序列化与反序列化能力的系统组件
+- `Application Layer`:应用层,给用户提供具体功能。
+- `App`:各种应用,需要使用系统服务层的功能。
+- `System Service Layer`:系统服务层,给上层应用提供系统服务。
+- `system services`:各种系统服务,需要使用 `JSON` 相关的功能。
+- `ylong_json`:系统组件,给系统服务层的相关组件提供通用的 `JSON` 序列化与反序列化能力。
 - `serde`:第三方库,用于高效、通用地序列化和反序列化 `Rust` 数据结构。
 ### ylong_json 内部架构图
 ![structure](./figures/ylong_json_inner_structure.png)
-`ylong_json` 内部主要分为两个模块,以自定义 `JsonValue` 类型为核心的模块和适配第三方库 `serde` 的模块。
+`ylong_json` 主要分为三个子模块:`JsonValue` 子模块、`serde` 子模块、C-ffi 子模块。
-1. `JsonValue` 是 `ylong_json` 内部自定义的结构类型,以该结构为核心构建 `json` 的序列化与反序列化功能。
-- `JsonValue` :核心结构类型,存储 `json` 内容信息,共有 6 种枚举类型变体。
-- `LinkedList`, `Vec`, `BTreeMap`:`Array` 与 `Object` 内部数据存储的三种方式,通过 `features` 选择。
-- 序列化功能:支持将 `JsonValue` 实例输出为紧凑型字符串或写到输出流中。
-- 反序列化功能:支持解析 `json` 文本或输入流中的 `json` 内容并生成 `JsonValue` 实例。
+`JsonValue` 子模块提供了一种基础数据结构 `JsonValue`。
+`JsonValue` 支持以缩进型格式或紧凑型格式将自身序列化成 `JSON` 文本。任意语法正确的 `JSON` 文本也能被反序列化成一个对应的 `JsonValue` 数据结构。
+`JsonValue` 支持增删改查,可以使用指定接口变更 `JsonValue` 中的数据内容。
+`JsonValue` 支持 `JSON` 语法中全部的数据类型:`null`, `boolean`, `number`, `string`, `array`, `object`,且按照 `ECMA-404` 实现其全部功能。
+针对于 `array` 和 `object` 语法结构,`JsonValue` 提供了多种底层数据结构以针对不同使用场景,例如对于 `array` 结构,支持底层使用 `Vec` 或 `LinkedList`,对于 `object`,支持其底层使用 `Vec`, `LinkedList` 或 `Btree`。
+在不同的底层数据结构之上,`array` 和 `object` 会体现出不同的创建和查询性能,例如基于 `Btree` 数据结构的 `object` 在查询方面具有较高性能表现,`LinkedList` 或 `Vec` 在创建方面具有较高性能表现。
-2. `ylong_json` 适配了第三方库 `serde`
-- `Serializer`:序列化输出的辅助结构类型。
-- `Deserializer`:反序列化输出的辅助结构类型。
-- 序列化功能:支持将实现了 `serde::Serialize` trait 的类型实例序列化为 `json` 文本内容或将内容写到输出流中。
-- 反序列化功能:若实现了 `serde::Deserialize` trait 的类型在 `json` 内容中,则可将该部分 `json` 内容反序列化为该类型的实例。
+`serde` 子模块提供了基于 `serde` 第三方库提供的 `Serialize` 和 `Deserialize` trait 的过程宏功能,可以支持用户结构体和 `JSON` 文本的快速转换。
+`serde` 相较于 `JsonValue` 的优势是使用便捷,用户无需将 `JSON` 文本先转换为 `JsonValue` 再从其中取出指定数据生成 `Rust` 结构体,只需给结构体设定 `Serialize` 和 `Deserialize` 过程宏标签,即可使用 `ylong_json` 中提供的接口结构体序列化成 `JSON` 文本,或将对应的 `JSON` 文本转换为用户结构体。
+
+C-ffi 模块提供了基于 `JsonValue` 模块的 C 接口层,方便用户使用 C 接口调用 `ylong_json` 库的功能。
 ## 目录
 ```
@@ -73,39 +77,47 @@ external_deps = ["ylong_json:lib"]
 详情内容请见[用户指南](./docs/user_guide_zh.md)
 ## 性能测试
+
+以下测试来源于 [`nativejson-benchmark`](https://www.github.com/miloyip/nativejson-benchmark)。
+
+测试环境信息如下:
 ```
-1.测试环境
 操作系统:Linux
 架构:x86_64
 字节序:小端
 CPU 型号:Intel(R) Xeon(R) Gold 6278C CPU @ 2.60GHz
 CPU 核心数:8
-内存:16G
-
-2.测试结果
-| 序列化 | ylong_json | serde_json |
------------------------------------------------
-| null | 150 ns/iter | 175 ns/iter |
-| boolean | 155 ns/iter | 178 ns/iter |
-| number | 309 ns/iter | 291 ns/iter |
-| string | 513 ns/iter | 413 ns/iter |
-| array | 998 ns/iter | 1,075 ns/iter |
-| object | 1,333 ns/iter | 1,348 ns/iter |
-| example1 | 12,537 ns/iter | 12,288 ns/iter |
-| example2 | 23,754 ns/iter | 21,936 ns/iter |
-| example3 | 103,061 ns/iter | 97,247 ns/iter |
-| example4 | 15,234 ns/iter | 17,895 ns/iter |
-
-| 反序列化 | ylong_json | serde_json |
------------------------------------------------
-| null | 257 ns/iter | 399 ns/iter |
-| boolean | 260 ns/iter | 400 ns/iter |
-| number | 1,507 ns/iter | 989 ns/iter |
-| string | 414 ns/iter | 610 ns/iter |
-| array | 2,258 ns/iter | 2,148 ns/iter |
-| object | 810 ns/iter | 1,386 ns/iter |
-| example1 | 10,191 ns/iter | 10,227 ns/iter |
-| example2 | 15,753 ns/iter | 18,022 ns/iter |
-| example3 | 55,910 ns/iter | 59,717 ns/iter |
-| example4 | 18,461 ns/iter | 12,471 ns/iter |
+内存:8G
 ```
+
+测试的软件版本:
+
+cJSON 1.7.16
+
+测试结果:
+```
+======= ylong-json ==== parse | stringify ====
+canada.json            200 MB/s     90 MB/s
+citm_catalog.json      450 MB/s    300 MB/s
+twitter.json           340 MB/s    520 MB/s
+
+======== cJSON ======== parse | stringify ====
+canada.json             55 MB/s     11 MB/s
+citm_catalog.json      260 MB/s    170 MB/s
+twitter.json           210 MB/s    210 MB/s
+```
+
+测试结果描述:
+
+在 `nativejson-benchmark` 测试中提供了三种测试文件,其中 `canada.json` 包含了大量的 `number` 结构,`citm_catalog.json` 的各种数据类型较为平均,`twitter.json` 中存在各种 `UTF-8` 字符。
+为了保证测试公平性,`ylong_json` 开启了 `list_object` 和 `list_array` 以及 `ascii_only` feature。
+`list_object` 和 `list_array` feature 主要是为了保证和 `cJSON` 数据结构层面一致,都使用链表实现。
+`ascii_only` feature 是为了保证针对 `UTF-8` 字符的处理逻辑一致,`cJSON` 对于 UTF-8 字符不做处理。
+
+测试流程如下:
+ - 读取文件内容到内存,得到文件内容 `content`。
+ - 调用指定 `JSON` 库反序列化接口生成对应的 `JSON` 结构体 `data`。
+ - 调用 `JSON` 结构体的序列化接口生成输出内容 `result`。
+ - 利用 `content`,循环反序列化生成 100 次 `JSON` 结构体,取较小的处理时间 `t1`。
+ - 利用 `data`,序列化生成 100 次 `JSON` 文本,取较小的处理时间 `t2`。
+ - 计算解析速度,反序列化时间为 `content` 的长度除以 `t1`,序列化时间为 `JSON` 文本长度除以 `t2`。
\ No newline at end of file
diff --git a/figures/ylong_json_inner_structure.png b/figures/ylong_json_inner_structure.png
index 9d526af..4fa7daa 100644
Binary files a/figures/ylong_json_inner_structure.png and b/figures/ylong_json_inner_structure.png differ
diff --git a/figures/ylong_json_oh_relate.png b/figures/ylong_json_oh_relate.png
index 95f064c..35f5cfb 100644
Binary files a/figures/ylong_json_oh_relate.png and b/figures/ylong_json_oh_relate.png differ
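
Note on the measurement procedure: the testing process that this change documents in the README (repeat an operation 100 times, keep the smallest time, divide the data length by that time) could be sketched roughly as below. This is only an illustration using the Rust standard library; it is not the `nativejson-benchmark` harness and does not call any `ylong_json` or `cJSON` API. The helper names `min_time_of` and `throughput_mb_per_s` are hypothetical, the closure is a placeholder workload, and 1 MB is assumed to be 1,000,000 bytes.

```rust
use std::time::{Duration, Instant};

/// Runs `op` `rounds` times and returns the smallest elapsed time
/// (the "take the smallest processing time" step).
fn min_time_of<F: FnMut()>(rounds: usize, mut op: F) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..rounds {
        let start = Instant::now();
        op();
        best = best.min(start.elapsed());
    }
    best
}

/// Speed in MB/s: data length divided by time.
/// Assumption: 1 MB = 1_000_000 bytes; the real benchmark may use another unit.
fn throughput_mb_per_s(len_bytes: usize, elapsed: Duration) -> f64 {
    len_bytes as f64 / 1_000_000.0 / elapsed.as_secs_f64()
}

fn main() {
    // Stand-in for the file content read into memory.
    let content = r#"{"key": [1, 2, 3]}"#.repeat(1000);

    // Placeholder workload; a real run would call a JSON parser here.
    let t1 = min_time_of(100, || {
        let braces = content.bytes().filter(|b| *b == b'{').count();
        std::hint::black_box(braces);
    });

    println!(
        "placeholder parse speed: {:.1} MB/s",
        throughput_mb_per_s(content.len(), t1)
    );
}
```

Taking the minimum rather than the mean reduces the influence of scheduling noise on a shared machine, which matches the "smallest processing time" wording in the README text.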