rogerespel/ewts-js

Tibetan Phonetics and Transliteration

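This JavaScript package implements two things: conversion between Wylie/EWTS transliteration and Tibetan Unicode, and generation of approximate phonetics.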

Installation

npm install tibetan-ewts-converter

As of version 2, this is a pure ES module.

Usage

Wylie/EWTS conversion:

import { EwtsConverter } from 'tibetan-ewts-converter/EwtsConverter';
const ewts = new EwtsConverter();
// EWTS (Wylie) to Tibetan Unicode
console.log(ewts.to_unicode("sangs rgyas"));
// Tibetan Unicode to EWTS
console.log(ewts.to_ewts("སངས་རྒྱས"));

Approximate phonetics:

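import { get_phonetics } from 'tibetan-ewts-converter';
// Pick a phonetics style and, where the style has variants, a language
const pho = get_phonetics({ style: "lotsawahouse", lang: "en" });
// autosplit lets the library split the input itself
console.log(pho.phonetics("sangs rgyas", { autosplit: true }));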

EwtsConverter options

The constructor accepts an optional object with named options:

  • check: generate warnings for illegal consonant sequences and the like; default is true.
  • check_strict: stricter checking, examine the whole stack; default is true.
  • fix_spacing: remove spaces after newlines, collapse multiple tseks into one, fix case, etc.; default is true.
  • sloppy: silently fix a number of common Wylie mistakes when converting to Unicode; default is false.
  • leave_dubious: when converting to Unicode, leave dubious syllables unprocessed, between [brackets], instead of making a best attempt; default is false.
  • pass_through: when converting to EWTS, pass through non-Tibetan characters instead of converting them to [comments]; default is false.

let ewts = new EwtsConverter({ check_strict: false, leave_dubious: true, sloppy: true });
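
When the same option combinations are needed in several places, it may help to keep them as named presets. The sketch below is only an illustration: the preset names (`strict`, `lenient`) and the helper `converterOptions` are hypothetical and not part of the package; only the option names come from the list above.

```javascript
// Hypothetical presets: the names and groupings are illustrative;
// only the option names are taken from the list above.
const PRESETS = {
  // Careful conversion: full checking, dubious syllables left in [brackets].
  strict: { check: true, check_strict: true, sloppy: false, leave_dubious: true },
  // Forgiving conversion: relaxed stack checks, common Wylie mistakes fixed silently.
  lenient: { check: true, check_strict: false, sloppy: true, leave_dubious: false },
};

// Return a fresh options object for the named preset, with optional overrides.
function converterOptions(preset = "strict", overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(PRESETS, preset)) {
    throw new Error(`Unknown preset: ${preset}`);
  }
  // Spread into a new object so callers never mutate the shared preset.
  return { ...PRESETS[preset], ...overrides };
}
```

The result is a plain object, so it can be handed straight to the constructor, for instance `new EwtsConverter(converterOptions("lenient", { pass_through: true }))`.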

TibetanPhonetics options

get_phonetics accepts an optional object with named options:

  • style: one of 'thl', 'lotsawahouse', 'rigpa', 'lhasey', 'padmakara'.
  • lang: 2-letter language code, for styles that have language variants (e.g. 'en', 'es').

The phonetics method takes a string (Tibetan Unicode or EWTS), and an optional options object.

Unless you're using a better external tokenizer, always pass the option { autosplit: true }.
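
For multi-line input, one possible approach is to call the phonetics method once per line. The helper below is a minimal sketch, not part of the package: `phoneticsByLine` is a hypothetical name, and `pho` is assumed to be the object returned by get_phonetics, exposing `phonetics(text, options)` as described above. It turns autosplit on by default but lets the caller override it, for the case where an external tokenizer is used.

```javascript
// Hypothetical helper: `pho` is assumed to be the object returned by
// get_phonetics(), exposing phonetics(text, options) as described above.
function phoneticsByLine(pho, text, options = {}) {
  return text
    .split(/\r?\n/)                      // handle both Unix and Windows line endings
    .map((line) => line.trim())
    .filter((line) => line.length > 0)   // skip blank lines
    // autosplit defaults to true; an explicit value in `options` wins.
    .map((line) => pho.phonetics(line, { autosplit: true, ...options }));
}
```

With the real package this would be called as `phoneticsByLine(get_phonetics({ style: "thl" }), text)`, returning one phonetics result per non-empty line.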

See the source code for many other options that give fine control over phonetics generation. You can also directly import and use the classes TibetanPhonetics, TibetanPhoneticsRigpa, TibetanPhoneticsLhasey and TibetanPhoneticsPadmakara.

Code and history

The first version of this code was written in Perl around 2008. In 2010 the EWTS/Unicode converter was ported to Java at the request of TBRC, now BDRC.

The Java code for phonetics was then ported to other languages by different groups.

Phonetics generation was added to this project in 2025, also ported from the original Perl with the help of AI.

License

Apache 2.0.
