Trimming the icudtl.dat file in the icu4c library

Background

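The project needs to convert ANSI-encoded text to UTF-8, so the icu4c library was brought in. The .dat file produced by a default build is over 30 MB. Since the only requirement is to convert Windows ANSI-encoded text to UTF-8 on macOS, the data file should be trimmed down.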

Building the icu4c project

Source download: https://github.com/unicode-org/icu. This article builds version 71.1. ICU ships in a C version and a Java version; everything below uses the C version (icu4c).

1. In a terminal, change to the icu4c/source directory

cd icu4c/source

2. Give the build scripts execute permission

chmod +x runConfigureICU configure install-sh

3. Create a build directory under source and enter it

mkdir buildMac && cd buildMac

4. Run the pre-build configuration, with macOS as the target platform

../runConfigureICU MacOSX

5. Build

gnumake

The resulting icudtl.dat file is 33.4 MB by default.

Trimming icudtl.dat

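1. Create a filters.json file in the buildMac directory. Exclude every module and keep only conversion_mappings, restricted to the single ANSI code page that is needed (windows-936-2000, i.e. GBK). The content is as follows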

{
  "featureFilters": {
    // Based on the ICU63 version of
    "brkitr_dictionaries": {
      "filterType": "exclude"
    },
    // # List of break iterator files (brk).
    "brkitr_rules": {
      "filterType": "exclude"
    },
    // Need to explicitly add "root"
    "brkitr_tree": { "filterType": "exclude" },
    "conversion_mappings": {
      "includelist": [
        // UCM_SOURCE_CORE=...
        "windows-936-2000"
      ]
    },
    "coll_tree": { "filterType": "exclude" },
    "coll_ucadata": { "filterType": "exclude" },
    "confusables": { "filterType": "exclude" },
    "curr_tree": { "filterType": "exclude" },
    "lang_tree": { "filterType": "exclude" },
    "locales_tree": { "filterType": "exclude" },
    "misc": { "filterType": "exclude" },
    "normalization": { "filterType": "exclude" },
    "rbnf_tree": { "filterType": "exclude" },
    "rbnf_index": { "filterType": "exclude" },
    "region_tree": { "filterType": "exclude" },
    "stringprep": { "filterType": "exclude" },
    "translit": { "filterType": "exclude" },
    "unames": { "filterType": "exclude" },
    "unit_tree": { "filterType": "exclude" },
    "zone_tree": { "filterType": "exclude" }
  },
  "resourceFilters": [
    {
      "categories": [
        "brkitr_tree",
        "coll_tree",
        "curr_tree",
        "lang_tree",
        "region_tree",
        "unit_tree",
        "zone_tree"
      ],
      "rules": [ "-/Version" ]
    }
  ]
}
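
The same filter can also be generated from a list of category names, which may be easier to maintain if more converters are kept later. The following is only a sketch: the helper name build_filters is hypothetical, and the category names are simply copied from the file above.

```python
import json

# Feature categories excluded in the filter file above.
EXCLUDED = [
    "brkitr_dictionaries", "brkitr_rules", "brkitr_tree",
    "coll_tree", "coll_ucadata", "confusables", "curr_tree",
    "lang_tree", "locales_tree", "misc", "normalization",
    "rbnf_tree", "rbnf_index", "region_tree", "stringprep",
    "translit", "unames", "unit_tree", "zone_tree",
]

# Categories listed under resourceFilters in the file above.
TREES = [
    "brkitr_tree", "coll_tree", "curr_tree", "lang_tree",
    "region_tree", "unit_tree", "zone_tree",
]


def build_filters(converters):
    """Return a filter dict that keeps only the given converter mappings."""
    features = {name: {"filterType": "exclude"} for name in EXCLUDED}
    features["conversion_mappings"] = {"includelist": list(converters)}
    return {
        "featureFilters": features,
        "resourceFilters": [{"categories": TREES, "rules": ["-/Version"]}],
    }


filters = build_filters(["windows-936-2000"])
text = json.dumps(filters, indent=2)  # strict JSON, no comments
print(text)
```

Writing text to filters.json gives a comment-free version of the file, and keeping another code page only requires adding its mapping name to the list passed to build_filters.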

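2. Install the hjson parsing library. If you would rather not use JSON with comments, you can simply delete the // lines above, in which case hjson does not need to be installed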

pip3 install --user hjson jsonschema
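
If hjson is not wanted, the // lines can be removed with a short script instead of by hand. This is a minimal sketch, assuming every comment sits on a line of its own as in the file above; the function name strip_comment_lines is hypothetical, and the sample string stands in for the real file.

```python
import json


def strip_comment_lines(text: str) -> str:
    """Drop every line whose first non-blank characters are //."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("//")]
    return "\n".join(kept)


# Small inline sample shaped like the filter file above.
sample = """{
  "featureFilters": {
// a comment on its own line
    "conversion_mappings": {
      "includelist": [
// another comment
        "windows-936-2000"
      ]
    }
  }
}"""

cleaned = strip_comment_lines(sample)
parsed = json.loads(cleaned)  # raises ValueError if the result is not strict JSON
print(parsed["featureFilters"]["conversion_mappings"]["includelist"])
```

To apply it to the real file, read filters.json, pass its content through strip_comment_lines, and write the result back. A comment that trails a value on the same line would not be handled by this sketch.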

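3. Delete all files under icu4c/source/buildMac/data and keep everything else, so that the other modules are not rebuilt; only the data module needs to be built again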

4. Set a temporary environment variable, ICU_DATA_FILTER_FILE, that points to the filters.json file

export ICU_DATA_FILTER_FILE="/Users/nickname/icu-release-71-1/icu4c/source/buildMac/filters.json"
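
Before re-running the configuration, it can be worth checking that the variable really points at an existing file. The sketch below is one way to do that; check_filter_file is a hypothetical helper, and the demo uses a temporary file rather than the real path.

```python
import os
import tempfile


def check_filter_file(environ):
    """Return (ok, message) for the ICU_DATA_FILTER_FILE entry in environ."""
    path = environ.get("ICU_DATA_FILTER_FILE", "")
    if not path:
        return False, "ICU_DATA_FILTER_FILE is not set"
    if not os.path.isfile(path):
        return False, "no such file: " + path
    return True, "filter file found: " + path


# Demo with a temporary stand-in for filters.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "filters.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("{}")
    found = check_filter_file({"ICU_DATA_FILTER_FILE": demo_path})
    missing = check_filter_file({"ICU_DATA_FILTER_FILE": demo_path + ".missing"})
unset = check_filter_file({})
print(found[0], missing[0], unset[0])
```

In the shell where the export was run, calling check_filter_file(os.environ) checks the real setting, since exported variables are inherited by child processes.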

5. Re-run the build configuration

../runConfigureICU MacOSX

# The following message near the end of the output means filters.json was applied successfully
#Note: Applying filters from /Users/nickname/icu-release-71-1/icu4c/source/buildMac/filters.json.

6. Rebuild

gnumake

7. The build succeeds, and the trimmed icudtl.dat file is only 133 KB
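
To put the two sizes in proportion, the reduction can be computed directly. This sketch assumes decimal units (1 MB = 1000 KB) for the figures quoted in this article; reduction_percent is a hypothetical helper.

```python
def reduction_percent(before_bytes: float, after_bytes: float) -> float:
    """Percentage by which the file shrank."""
    return (1 - after_bytes / before_bytes) * 100


before = 33.4 * 1000 * 1000   # default build: 33.4 MB (decimal units assumed)
after = 133 * 1000            # trimmed build: 133 KB
pct = reduction_percent(before, after)
print(f"{pct:.1f}% smaller")  # prints "99.6% smaller"
```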

Listing all encodings supported by icudtl.dat

Available converters: 4
0  name:UTF-8  alias: 0: UTF-8 1: unicode-1-1-utf-8 2: utf8 
1  name:utf-16be  alias: 0: utf-16be 
2  name:utf-16le  alias: 0: utf-16le 1: utf-16 
3  name:windows-936-2000  alias: 0: windows-936-2000 1: GBK 2: chinese 3: iso-ir-58 4: GB2312 5: GB_2312-80 6: gb_2312 7: csGB2312 8: csiso58gb231280 9: x-gbk 
