Traditional Chinese corpus collection for LLM training (pre-training, instruction-tuning, and RLHF/alignment).
Oscar, Li
liswei
AI & ML interests
Multimodal Deep Learning, Natural Language Processing, Efficient Fine-Tuning
models 8
liswei/emojilm-0.6b-GGUF
0.6B • Updated • 20
liswei/emojilm-0.6b
0.6B • Updated • 2
liswei/Taiwan-ELM
liswei/Taiwan-ELM-1_1B-Instruct
Text Generation • 1B • Updated • 5 • 1
liswei/Taiwan-ELM-270M-Instruct
Text Generation • 0.3B • Updated • 11 • 1
liswei/Taiwan-ELM-1_1B
Text Generation • 1B • Updated • 4 • 1
liswei/Taiwan-ELM-270M
Text Generation • 0.3B • Updated • 22 • 2
liswei/EmojiLMSeq2SeqLoRA
0.6B • Updated • 3
datasets 10
liswei/Taiwan-Text-Excellence-2B
Viewer • Updated • 1.78M • 18 • 20
liswei/PromptPair-TW
Viewer • Updated • 119k • 12 • 2
liswei/news-collection-zhtw
Viewer • Updated • 592k • 99 • 3
liswei/wikinews-zhtw-dedup
Viewer • Updated • 8.37k • 11
liswei/wikipedia-zhtw-dedup
Viewer • Updated • 1.18M • 38 • 3
liswei/common-crawl-zhtw
Viewer • Updated • 2.71M • 63 • 6
liswei/coct-en-zhtw-dedup
Viewer • Updated • 217k • 7 • 2
liswei/c4-zhtw
Viewer • Updated • 4.86M • 41 • 3
liswei/rm-static-zhTW
Viewer • Updated • 81.4k • 17 • 30
liswei/NTU-Tree
Viewer • Updated • 478 • 55 • 4