Tokenizer
The tokenizer is a tool that tokenizes the "title" and "content" fields of a tkeir document. It runs as a REST service. Tokenization depends on the annotation model created by the tool stored in tkeir/thot/tasks/tokenizer/createAnnotationResources.py. That tool creates a typed compound word list.
Tokenizer API
Tokenizer configuration
Example of configuration:
{
"logger": {
"logging-level": "{{ project.loglevel }}"
},
"tokenizers": {
"segmenters":[{
"language":"en",
"resources-base-path":"{{ project.path }}/resources/modeling/tokenizer/en",
"mwe": "tkeir_mwe.pkl",
"normalization-rules":"tokenizer-rules.json",
"annotation-resources-reference":"annotation-resources.json"
}],
"network": {
"host":"0.0.0.0",
"port":10001,
"associate-environment": {
"host":"TOKENIZER_HOST",
"port":"TOKENIZER_PORT"
}
},
"runtime":{
"request-max-size":100000000,
"request-buffer-queue-size":100,
"keep-alive":true,
"keep-alive-timeout":500,
"graceful-shutown-timeout":15.0,
"request-timeout":600,
"response-timeout":600,
"workers":1
}
}
}
The tokenizer configuration is an aggregation of the network configuration, the serialize configuration, the runtime configuration (in the field tokenizers), and the logger (at top level). The segmenter configuration is a table containing the paths to the Multiple Word Expression (MWE) entries:
- language: the language of the tokenizer
- resources-base-path: the path to the resources (containing the files created by the tool createAnnotationResources.py)
- mwe: the file containing the MWE entries
- normalization-rules: the file containing the normalization rules
- annotation-resources-reference: reference to the annotation file, needed at tokenizer initialization
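The segmenter fields listed above can be read with a few lines of Python. This is only a minimal sketch, not the loader tkeir itself uses: the helper name resolve_segmenter_paths is hypothetical, the /tmp prefix stands in for the {{ project.path }} template value, and it assumes that the three file names are relative to resources-base-path.

```python
import json
import os

# Hypothetical helper (not part of tkeir): join each resource file name
# with the segmenter's resources-base-path.
# Assumption: file names are relative to resources-base-path.
def resolve_segmenter_paths(segmenter: dict) -> dict:
    base = segmenter["resources-base-path"]
    keys = ("mwe", "normalization-rules", "annotation-resources-reference")
    return {key: os.path.join(base, segmenter[key]) for key in keys}


# Reduced configuration; "/tmp" replaces the {{ project.path }} placeholder.
config = json.loads("""
{
  "tokenizers": {
    "segmenters": [{
      "language": "en",
      "resources-base-path": "/tmp/resources/modeling/tokenizer/en",
      "mwe": "tkeir_mwe.pkl",
      "normalization-rules": "tokenizer-rules.json",
      "annotation-resources-reference": "annotation-resources.json"
    }]
  }
}
""")

# One entry per configured language.
paths_by_language = {
    segmenter["language"]: resolve_segmenter_paths(segmenter)
    for segmenter in config["tokenizers"]["segmenters"]
}

for language, paths in paths_by_language.items():
    for name, path in paths.items():
        print(language, name, path)
```

Indexing the result by language makes it easy to check that every configured segmenter has its three resource files before the service starts.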
The tokenizer accepts a rule file that selects parsers (not yet implemented), fixes common typos, and maps words (for example, British English spellings to US spellings). The normalization rule file is a simple JSON file with the following fields:
- parsers (NOT YET IMPLEMENTED): the available parsers (for example, pyvalem to parse chemistry formulas)
- normalization/word-mapping: word mappings
- normalization/typos: typo fixes
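To illustrate how the normalization/word-mapping field can be used, here is a minimal sketch that builds a lookup table from a few "from"/"to" pairs and applies it to a token list. It is not the tkeir implementation: the function normalize_tokens is hypothetical, and the case-insensitive lookup is an assumption made for the example.

```python
import json

# A reduced rule file with three word-mapping entries.
rules = json.loads("""
{
  "normalization": {
    "word-mapping": [
      {"from": "accessorise", "to": "accessorize"},
      {"from": "aeroplane", "to": "airplane"},
      {"from": "colour", "to": "color"}
    ]
  }
}
""")

# Build the lookup table once; each entry maps "from" to "to".
word_map = {
    entry["from"]: entry["to"]
    for entry in rules["normalization"]["word-mapping"]
}


def normalize_tokens(tokens):
    """Replace each token found in word_map; leave other tokens unchanged.

    Assumption: the lookup is done on the lowercased token.
    """
    return [word_map.get(token.lower(), token) for token in tokens]


normalized = normalize_tokens(["The", "aeroplane", "has", "a", "new", "colour"])
print(normalized)  # ['The', 'airplane', 'has', 'a', 'new', 'color']
```

A dictionary keeps each lookup constant-time, which matters because the mapping list can hold many hundreds of entries. The rule file itself looks like this: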
{
"parsers": {
"on-document":["texsoup"],
"on-tokens":[
{"parsers":"chemparse","max-tokens-merge":50}
]
},
"normalization": {
"word-mapping" : [
{"from":"accessorise", "to":"accessorize"},
{"from":"accessorised", "to":"accessorized"},
{"from":"accessorises", "to":"accessorizes"},
{"from":"accessorising", "to":"accessorizing"},
{"from":"acclimatisation", "to":"acclimatization"},
{"from":"acclimatise", "to":"acclimatize"},
{"from":"acclimatised", "to":"acclimatized"},
{"from":"acclimatises", "to":"acclimatizes"},
{"from":"acclimatising", "to":"acclimatizing"},
{"from":"accoutrements", "to":"accouterments"},
{"from":"aeon", "to":"eon"},
{"from":"aeons", "to":"eons"},
{"from":"aerogramme", "to":"aerogram"},
{"from":"aerogrammes", "to":"aerograms"},
{"from":"aeroplane", "to":"airplane"},
{"from":"aeroplanes", "to":"airplanes"},
{"from":"aesthete", "to":"esthete"},
{"from":"aesthetes", "to":"esthetes"},
{"from":"aesthetic", "to":"esthetic"},
{"from":"aesthetically", "to":"esthetically"},
{"from":"aesthetics", "to":"esthetics"},
{"from":"aetiology", "to":"etiology"},
{"from":"ageing", "to":"aging"},
{"from":"aggrandisement", "to":"aggrandizement"},
{"from":"agonise", "to":"agonize"},
{"from":"agonised", "to":"agonized"},
{"from":"agonises", "to":"agonizes"},
{"from":"agonising", "to":"agonizing"},
{"from":"agonisingly", "to":"agonizingly"},
{"from":"almanack", "to":"almanac"},
{"from":"almanacks", "to":"almanacs"},
{"from":"aluminium", "to":"aluminum"},
{"from":"amortisable", "to":"amortizable"},
{"from":"amortisation", "to":"amortization"},
{"from":"amortisations", "to":"amortizations"},
{"from":"amortise", "to":"amortize"},
{"from":"amortised", "to":"amortized"},
{"from":"amortises", "to":"amortizes"},
{"from":"amortising", "to":"amortizing"},
{"from":"amphitheatre", "to":"amphitheater"},
{"from":"amphitheatres", "to":"amphitheaters"},
{"from":"anaemia", "to":"anemia"},
{"from":"anaemic", "to":"anemic"},
{"from":"anaesthesia", "to":"anesthesia"},
{"from":"anaesthetic", "to":"anesthetic"},
{"from":"anaesthetics", "to":"anesthetics"},
{"from":"anaesthetise", "to":"anesthetize"},
{"from":"anaesthetised", "to":"anesthetized"},
{"from":"anaesthetises", "to":"anesthetizes"},
{"from":"anaesthetising", "to":"anesthetizing"},
{"from":"anaesthetist", "to":"anesthetist"},
{"from":"anaesthetists", "to":"anesthetists"},
{"from":"anaesthetize", "to":"anesthetize"},
{"from":"anaesthetized", "to":"anesthetized"},
{"from":"anaesthetizes", "to":"anesthetizes"},
{"from":"anaesthetizing", "to":"anesthetizing"},
{"from":"analogue", "to":"analog"},
{"from":"analogues", "to":"analogs"},
{"from":"analyse", "to":"analyze"},
{"from":"analysed", "to":"analyzed"},
{"from":"analyses", "to":"analyzes"},
{"from":"analysing", "to":"analyzing"},
{"from":"anglicise", "to":"anglicize"},
{"from":"anglicised", "to":"anglicized"},
{"from":"anglicises", "to":"anglicizes"},
{"from":"anglicising", "to":"anglicizing"},
{"from":"annualised", "to":"annualized"},
{"from":"antagonise", "to":"antagonize"},
{"from":"antagonised", "to":"antagonized"},
{"from":"antagonises", "to":"antagonizes"},
{"from":"antagonising", "to":"antagonizing"},
{"from":"apologise", "to":"apologize"},
{"from":"apologised", "to":"apologized"},
{"from":"apologises", "to":"apologizes"},
{"from":"apologising", "to":"apologizing"},
{"from":"appal", "to":"appall"},
{"from":"appals", "to":"appalls"},
{"from":"appetiser", "to":"appetizer"},
{"from":"appetisers", "to":"appetizers"},
{"from":"appetising", "to":"appetizing"},
{"from":"appetisingly", "to":"appetizingly"},
{"from":"arbour", "to":"arbor"},
{"from":"arbours", "to":"arbors"},
{"from":"archaeological", "to":"archeological"},
{"from":"archaeologically", "to":"archeologically"},
{"from":"archaeologist", "to":"archeologist"},
{"from":"archaeologists", "to":"archeologists"},
{"from":"archaeology", "to":"archeology"},
{"from":"ardour", "to":"ardor"},
{"from":"armour", "to":"armor"},
{"from":"armoured", "to":"armored"},
{"from":"armourer", "to":"armorer"},
{"from":"armourers", "to":"armorers"},
{"from":"armouries", "to":"armories"},
{"from":"armoury", "to":"armory"},
{"from":"artefact", "to":"artifact"},
{"from":"artefacts", "to":"artifacts"},
{"from":"authorise", "to":"authorize"},
{"from":"authorised", "to":"authorized"},
{"from":"authorises", "to":"authorizes"},
{"from":"authorising", "to":"authorizing"},
{"from":"axe", "to":"ax"},
{"from":"backpedalled", "to":"backpedaled"},
{"from":"backpedalling", "to":"backpedaling"},
{"from":"bannister", "to":"banister"},
{"from":"bannisters", "to":"banisters"},
{"from":"baptise", "to":"baptize"},
{"from":"baptised", "to":"baptized"},
{"from":"baptises", "to":"baptizes"},
{"from":"baptising", "to":"baptizing"},
{"from":"bastardise", "to":"bastardize"},
{"from":"bastardised", "to":"bastardized"},
{"from":"bastardises", "to":"bastardizes"},
{"from":"bastardising", "to":"bastardizing"},
{"from":"battleaxe", "to":"battleax"},
{"from":"baulk", "to":"balk"},
{"from":"baulked", "to":"balked"},
{"from":"baulking", "to":"balking"},
{"from":"baulks", "to":"balks"},
{"from":"bedevilled", "to":"bedeviled"},
{"from":"bedevilling", "to":"bedeviling"},
{"from":"behaviour", "to":"behavior"},
{"from":"behavioural", "to":"behavioral"},
{"from":"behaviourism", "to":"behaviorism"},
{"from":"behaviourist", "to":"behaviorist"},
{"from":"behaviourists", "to":"behaviorists"},
{"from":"behaviours", "to":"behaviors"},
{"from":"behove", "to":"behoove"},
{"from":"behoved", "to":"behooved"},
{"from":"behoves", "to":"behooves"},
{"from":"bejewelled", "to":"bejeweled"},
{"from":"belabour", "to":"belabor"},
{"from":"belaboured", "to":"belabored"},
{"from":"belabouring", "to":"belaboring"},
{"from":"belabours", "to":"belabors"},
{"from":"bevelled", "to":"beveled"},
{"from":"bevvies", "to":"bevies"},
{"from":"bevvy", "to":"bevy"},
{"from":"biassed", "to":"biased"},
{"from":"biassing", "to":"biasing"},
{"from":"bingeing", "to":"binging"},
{"from":"bougainvillaea", "to":"bougainvillea"},
{"from":"bougainvillaeas", "to":"bougainvilleas"},
{"from":"bowdlerise", "to":"bowdlerize"},
{"from":"bowdlerised", "to":"bowdlerized"},
{"from":"bowdlerises", "to":"bowdlerizes"},
{"from":"bowdlerising", "to":"bowdlerizing"},
{"from":"breathalyse", "to":"breathalyze"},
{"from":"breathalysed", "to":"breathalyzed"},
{"from":"breathalyser", "to":"breathalyzer"},
{"from":"breathalysers", "to":"breathalyzers"},
{"from":"breathalyses", "to":"breathalyzes"},
{"from":"breathalysing", "to":"breathalyzing"},
{"from":"brutalise", "to":"brutalize"},
{"from":"brutalised", "to":"brutalized"},
{"from":"brutalises", "to":"brutalizes"},
{"from":"brutalising", "to":"brutalizing"},
{"from":"buses", "to":"busses"},
{"from":"busing", "to":"bussing"},
{"from":"caesarean", "to":"cesarean"},
{"from":"caesareans", "to":"cesareans"},
{"from":"calibre", "to":"caliber"},
{"from":"calibres", "to":"calibers"},
{"from":"calliper", "to":"caliper"},
{"from":"callipers", "to":"calipers"},
{"from":"callisthenics", "to":"calisthenics"},
{"from":"canalise", "to":"canalize"},
{"from":"canalised", "to":"canalized"},
{"from":"canalises", "to":"canalizes"},
{"from":"canalising", "to":"canalizing"},
{"from":"cancellation", "to":"cancelation"},
{"from":"cancellations", "to":"cancelations"},
{"from":"cancelled", "to":"canceled"},
{"from":"cancelling", "to":"canceling"},
{"from":"candour", "to":"candor"},
{"from":"cannibalise", "to":"cannibalize"},
{"from":"cannibalised", "to":"cannibalized"},
{"from":"cannibalises", "to":"cannibalizes"},
{"from":"cannibalising", "to":"cannibalizing"},
{"from":"canonise", "to":"canonize"},
{"from":"canonised", "to":"canonized"},
{"from":"canonises", "to":"canonizes"},
{"from":"canonising", "to":"canonizing"},
{"from":"capitalise", "to":"capitalize"},
{"from":"capitalised", "to":"capitalized"},
{"from":"capitalises", "to":"capitalizes"},
{"from":"capitalising", "to":"capitalizing"},
{"from":"caramelise", "to":"caramelize"},
{"from":"caramelised", "to":"caramelized"},
{"from":"caramelises", "to":"caramelizes"},
{"from":"caramelising", "to":"caramelizing"},
{"from":"carbonise", "to":"carbonize"},
{"from":"carbonised", "to":"carbonized"},
{"from":"carbonises", "to":"carbonizes"},
{"from":"carbonising", "to":"carbonizing"},
{"from":"carolled", "to":"caroled"},
{"from":"carolling", "to":"caroling"},
{"from":"catalogue", "to":"catalog"},
{"from":"catalogued", "to":"cataloged"},
{"from":"catalogues", "to":"catalogs"},
{"from":"cataloguing", "to":"cataloging"},
{"from":"catalyse", "to":"catalyze"},
{"from":"catalysed", "to":"catalyzed"},
{"from":"catalyses", "to":"catalyzes"},
{"from":"catalysing", "to":"catalyzing"},
{"from":"categorise", "to":"categorize"},
{"from":"categorised", "to":"categorized"},
{"from":"categorises", "to":"categorizes"},
{"from":"categorising", "to":"categorizing"},
{"from":"cauterise", "to":"cauterize"},
{"from":"cauterised", "to":"cauterized"},
{"from":"cauterises", "to":"cauterizes"},
{"from":"cauterising", "to":"cauterizing"},
{"from":"cavilled", "to":"caviled"},
{"from":"cavilling", "to":"caviling"},
{"from":"centigramme", "to":"centigram"},
{"from":"centigrammes", "to":"centigrams"},
{"from":"centilitre", "to":"centiliter"},
{"from":"centilitres", "to":"centiliters"},
{"from":"centimetre", "to":"centimeter"},
{"from":"centimetres", "to":"centimeters"},
{"from":"centralise", "to":"centralize"},
{"from":"centralised", "to":"centralized"},
{"from":"centralises", "to":"centralizes"},
{"from":"centralising", "to":"centralizing"},
{"from":"centre", "to":"center"},
{"from":"centred", "to":"centered"},
{"from":"centrefold", "to":"centerfold"},
{"from":"centrefolds", "to":"centerfolds"},
{"from":"centrepiece", "to":"centerpiece"},
{"from":"centrepieces", "to":"centerpieces"},
{"from":"centres", "to":"centers"},
{"from":"channelled", "to":"channeled"},
{"from":"channelling", "to":"channeling"},
{"from":"characterise", "to":"characterize"},
{"from":"characterised", "to":"characterized"},
{"from":"characterises", "to":"characterizes"},
{"from":"characterising", "to":"characterizing"},
{"from":"cheque", "to":"check"},
{"from":"chequebook", "to":"checkbook"},
{"from":"chequebooks", "to":"checkbooks"},
{"from":"chequered", "to":"checkered"},
{"from":"cheques", "to":"checks"},
{"from":"chilli", "to":"chili"},
{"from":"chimaera", "to":"chimera"},
{"from":"chimaeras", "to":"chimeras"},
{"from":"chiselled", "to":"chiseled"},
{"from":"chiselling", "to":"chiseling"},
{"from":"circularise", "to":"circularize"},
{"from":"circularised", "to":"circularized"},
{"from":"circularises", "to":"circularizes"},
{"from":"circularising", "to":"circularizing"},
{"from":"civilise", "to":"civilize"},
{"from":"civilised", "to":"civilized"},
{"from":"civilises", "to":"civilizes"},
{"from":"civilising", "to":"civilizing"},
{"from":"clamour", "to":"clamor"},
{"from":"clamoured", "to":"clamored"},
{"from":"clamouring", "to":"clamoring"},
{"from":"clamours", "to":"clamors"},
{"from":"clangour", "to":"clangor"},
{"from":"clarinettist", "to":"clarinetist"},
{"from":"clarinettists", "to":"clarinetists"},
{"from":"collectivise", "to":"collectivize"},
{"from":"collectivised", "to":"collectivized"},
{"from":"collectivises", "to":"collectivizes"},
{"from":"collectivising", "to":"collectivizing"},
{"from":"colonisation", "to":"colonization"},
{"from":"colonise", "to":"colonize"},
{"from":"colonised", "to":"colonized"},
{"from":"coloniser", "to":"colonizer"},
{"from":"colonisers", "to":"colonizers"},
{"from":"colonises", "to":"colonizes"},
{"from":"colonising", "to":"colonizing"},
{"from":"colour", "to":"color"},
{"from":"colourant", "to":"colorant"},
{"from":"colourants", "to":"colorants"},
{"from":"coloured", "to":"colored"},
{"from":"coloureds", "to":"coloreds"},
{"from":"colourful", "to":"colorful"},
{"from":"colourfully", "to":"colorfully"},
{"from":"colouring", "to":"coloring"},
{"from":"colourize", "to":"colorize"},
{"from":"colourized", "to":"colorized"},
{"from":"colourizes", "to":"colorizes"},
{"from":"colourizing", "to":"colorizing"},
{"from":"colourless", "to":"colorless"},
{"from":"colours", "to":"colors"},
{"from":"commercialise", "to":"commercialize"},
{"from":"commercialised", "to":"commercialized"},
{"from":"commercialises", "to":"commercializes"},
{"from":"commercialising", "to":"commercializing"},
{"from":"compartmentalise", "to":"compartmentalize"},
{"from":"compartmentalised", "to":"compartmentalized"},
{"from":"compartmentalises", "to":"compartmentalizes"},
{"from":"compartmentalising", "to":"compartmentalizing"},
{"from":"computerise", "to":"computerize"},
{"from":"computerised", "to":"computerized"},
{"from":"computerises", "to":"computerizes"},
{"from":"computerising", "to":"computerizing"},
{"from":"conceptualise", "to":"conceptualize"},
{"from":"conceptualised", "to":"conceptualized"},
{"from":"conceptualises", "to":"conceptualizes"},
{"from":"conceptualising", "to":"conceptualizing"},
{"from":"connexion", "to":"connection"},
{"from":"connexions", "to":"connections"},
{"from":"contextualise", "to":"contextualize"},
{"from":"contextualised", "to":"contextualized"},
{"from":"contextualises", "to":"contextualizes"},
{"from":"contextualising", "to":"contextualizing"},
{"from":"cosier", "to":"cozier"},
{"from":"cosies", "to":"cozies"},
{"from":"cosiest", "to":"coziest"},
{"from":"cosily", "to":"cozily"},
{"from":"cosiness", "to":"coziness"},
{"from":"cosy", "to":"cozy"},
{"from":"councillor", "to":"councilor"},
{"from":"councillors", "to":"councilors"},
{"from":"counselled", "to":"counseled"},
{"from":"counselling", "to":"counseling"},
{"from":"counsellor", "to":"counselor"},
{"from":"counsellors", "to":"counselors"},
{"from":"crenellated", "to":"crenelated"},
{"from":"criminalise", "to":"criminalize"},
{"from":"criminalised", "to":"criminalized"},
{"from":"criminalises", "to":"criminalizes"},
{"from":"criminalising", "to":"criminalizing"},
{"from":"criticise", "to":"criticize"},
{"from":"criticised", "to":"criticized"},
{"from":"criticises", "to":"criticizes"},
{"from":"criticising", "to":"criticizing"},
{"from":"crueller", "to":"crueler"},
{"from":"cruellest", "to":"cruelest"},
{"from":"crystallisation", "to":"crystallization"},
{"from":"crystallise", "to":"crystallize"},
{"from":"crystallised", "to":"crystallized"},
{"from":"crystallises", "to":"crystallizes"},
{"from":"crystallising", "to":"crystallizing"},
{"from":"cudgelled", "to":"cudgeled"},
{"from":"cudgelling", "to":"cudgeling"},
{"from":"customise", "to":"customize"},
{"from":"customised", "to":"customized"},
{"from":"customises", "to":"customizes"},
{"from":"customising", "to":"customizing"},
{"from":"cypher", "to":"cipher"},
{"from":"cyphers", "to":"ciphers"},
{"from":"decentralisation", "to":"decentralization"},
{"from":"decentralise", "to":"decentralize"},
{"from":"decentralised", "to":"decentralized"},
{"from":"decentralises", "to":"decentralizes"},
{"from":"decentralising", "to":"decentralizing"},
{"from":"decriminalisation", "to":"decriminalization"},
{"from":"decriminalise", "to":"decriminalize"},
{"from":"decriminalised", "to":"decriminalized"},
{"from":"decriminalises", "to":"decriminalizes"},
{"from":"decriminalising", "to":"decriminalizing"},
{"from":"defence", "to":"defense"},
{"from":"defenceless", "to":"defenseless"},
{"from":"defences", "to":"defenses"},
{"from":"dehumanisation", "to":"dehumanization"},
{"from":"dehumanise", "to":"dehumanize"},
{"from":"dehumanised", "to":"dehumanized"},
{"from":"dehumanises", "to":"dehumanizes"},
{"from":"dehumanising", "to":"dehumanizing"},
{"from":"demeanour", "to":"demeanor"},
{"from":"demilitarisation", "to":"demilitarization"},
{"from":"demilitarise", "to":"demilitarize"},
{"from":"demilitarised", "to":"demilitarized"},
{"from":"demilitarises", "to":"demilitarizes"},
{"from":"demilitarising", "to":"demilitarizing"},
{"from":"demobilisation", "to":"demobilization"},
{"from":"demobilise", "to":"demobilize"},
{"from":"demobilised", "to":"demobilized"},
{"from":"demobilises", "to":"demobilizes"},
{"from":"demobilising", "to":"demobilizing"},
{"from":"democratisation", "to":"democratization"},
{"from":"democratise", "to":"democratize"},
{"from":"democratised", "to":"democratized"},
{"from":"democratises", "to":"democratizes"},
{"from":"democratising", "to":"democratizing"},
{"from":"demonise", "to":"demonize"},
{"from":"demonised", "to":"demonized"},
{"from":"demonises", "to":"demonizes"},
{"from":"demonising", "to":"demonizing"},
{"from":"demoralisation", "to":"demoralization"},
{"from":"demoralise", "to":"demoralize"},
{"from":"demoralised", "to":"demoralized"},
{"from":"demoralises", "to":"demoralizes"},
{"from":"demoralising", "to":"demoralizing"},
{"from":"denationalisation", "to":"denationalization"},
{"from":"denationalise", "to":"denationalize"},
{"from":"denationalised", "to":"denationalized"},
{"from":"denationalises", "to":"denationalizes"},
{"from":"denationalising", "to":"denationalizing"},
{"from":"deodorise", "to":"deodorize"},
{"from":"deodorised", "to":"deodorized"},
{"from":"deodorises", "to":"deodorizes"},
{"from":"deodorising", "to":"deodorizing"},
{"from":"depersonalise", "to":"depersonalize"},
{"from":"depersonalised", "to":"depersonalized"},
{"from":"depersonalises", "to":"depersonalizes"},
{"from":"depersonalising", "to":"depersonalizing"},
{"from":"deputise", "to":"deputize"},
{"from":"deputised", "to":"deputized"},
{"from":"deputises", "to":"deputizes"},
{"from":"deputising", "to":"deputizing"},
{"from":"desensitisation", "to":"desensitization"},
{"from":"desensitise", "to":"desensitize"},
{"from":"desensitised", "to":"desensitized"},
{"from":"desensitises", "to":"desensitizes"},
{"from":"desensitising", "to":"desensitizing"},
{"from":"destabilisation", "to":"destabilization"},
{"from":"destabilise", "to":"destabilize"},
{"from":"destabilised", "to":"destabilized"},
{"from":"destabilises", "to":"destabilizes"},
{"from":"destabilising", "to":"destabilizing"},
{"from":"dialled", "to":"dialed"},
{"from":"dialling", "to":"dialing"},
{"from":"dialogue", "to":"dialog"},
{"from":"dialogues", "to":"dialogs"},
{"from":"diarrhoea", "to":"diarrhea"},
{"from":"digitise", "to":"digitize"},
{"from":"digitised", "to":"digitized"},
{"from":"digitises", "to":"digitizes"},
{"from":"digitising", "to":"digitizing"},
{"from":"disc", "to":"disk"},
{"from":"discolour", "to":"discolor"},
{"from":"discoloured", "to":"discolored"},
{"from":"discolouring", "to":"discoloring"},
{"from":"discolours", "to":"discolors"},
{"from":"discs", "to":"disks"},
{"from":"disembowelled", "to":"disemboweled"},
{"from":"disembowelling", "to":"disemboweling"},
{"from":"disfavour", "to":"disfavor"},
{"from":"dishevelled", "to":"disheveled"},
{"from":"dishonour", "to":"dishonor"},
{"from":"dishonourable", "to":"dishonorable"},
{"from":"dishonourably", "to":"dishonorably"},
{"from":"dishonoured", "to":"dishonored"},
{"from":"dishonouring", "to":"dishonoring"},
{"from":"dishonours", "to":"dishonors"},
{"from":"disorganisation", "to":"disorganization"},
{"from":"disorganised", "to":"disorganized"},
{"from":"distil", "to":"distill"},
{"from":"distils", "to":"distills"},
{"from":"dramatisation", "to":"dramatization"},
{"from":"dramatisations", "to":"dramatizations"},
{"from":"dramatise", "to":"dramatize"},
{"from":"dramatised", "to":"dramatized"},
{"from":"dramatises", "to":"dramatizes"},
{"from":"dramatising", "to":"dramatizing"},
{"from":"draught", "to":"draft"},
{"from":"draughtboard", "to":"draftboard"},
{"from":"draughtboards", "to":"draftboards"},
{"from":"draughtier", "to":"draftier"},
{"from":"draughtiest", "to":"draftiest"},
{"from":"draughts", "to":"drafts"},
{"from":"draughtsman", "to":"draftsman"},
{"from":"draughtsmanship", "to":"draftsmanship"},
{"from":"draughtsmen", "to":"draftsmen"},
{"from":"draughtswoman", "to":"draftswoman"},
{"from":"draughtswomen", "to":"draftswomen"},
{"from":"draughty", "to":"drafty"},
{"from":"drivelled", "to":"driveled"},
{"from":"drivelling", "to":"driveling"},
{"from":"duelled", "to":"dueled"},
{"from":"duelling", "to":"dueling"},
{"from":"economise", "to":"economize"},
{"from":"economised", "to":"economized"},
{"from":"economises", "to":"economizes"},
{"from":"economising", "to":"economizing"},
{"from":"edoema", "to":"edema"},
{"from":"editorialise", "to":"editorialize"},
{"from":"editorialised", "to":"editorialized"},
{"from":"editorialises", "to":"editorializes"},
{"from":"editorialising", "to":"editorializing"},
{"from":"empathise", "to":"empathize"},
{"from":"empathised", "to":"empathized"},
{"from":"empathises", "to":"empathizes"},
{"from":"empathising", "to":"empathizing"},
{"from":"emphasise", "to":"emphasize"},
{"from":"emphasised", "to":"emphasized"},
{"from":"emphasises", "to":"emphasizes"},
{"from":"emphasising", "to":"emphasizing"},
{"from":"enamelled", "to":"enameled"},
{"from":"enamelling", "to":"enameling"},
{"from":"enamoured", "to":"enamored"},
{"from":"encyclopaedia", "to":"encyclopedia"},
{"from":"encyclopaedias", "to":"encyclopedias"},
{"from":"encyclopaedic", "to":"encyclopedic"},
{"from":"endeavour", "to":"endeavor"},
{"from":"endeavoured", "to":"endeavored"},
{"from":"endeavouring", "to":"endeavoring"},
{"from":"endeavours", "to":"endeavors"},
{"from":"energise", "to":"energize"},
{"from":"energised", "to":"energized"},
{"from":"energises", "to":"energizes"},
{"from":"energising", "to":"energizing"},
{"from":"enrol", "to":"enroll"},
{"from":"enrols", "to":"enrolls"},
{"from":"enthral", "to":"enthrall"},
{"from":"enthrals", "to":"enthralls"},
{"from":"epaulette", "to":"epaulet"},
{"from":"epaulettes", "to":"epaulets"},
{"from":"epicentre", "to":"epicenter"},
{"from":"epicentres", "to":"epicenters"},
{"from":"epilogue", "to":"epilog"},
{"from":"epilogues", "to":"epilogs"},
{"from":"epitomise", "to":"epitomize"},
{"from":"epitomised", "to":"epitomized"},
{"from":"epitomises", "to":"epitomizes"},
{"from":"epitomising", "to":"epitomizing"},
{"from":"equalisation", "to":"equalization"},
{"from":"equalise", "to":"equalize"},
{"from":"equalised", "to":"equalized"},
{"from":"equaliser", "to":"equalizer"},
{"from":"equalisers", "to":"equalizers"},
{"from":"equalises", "to":"equalizes"},
{"from":"equalising", "to":"equalizing"},
{"from":"eulogise", "to":"eulogize"},
{"from":"eulogised", "to":"eulogized"},
{"from":"eulogises", "to":"eulogizes"},
{"from":"eulogising", "to":"eulogizing"},
{"from":"evangelise", "to":"evangelize"},
{"from":"evangelised", "to":"evangelized"},
{"from":"evangelises", "to":"evangelizes"},
{"from":"evangelising", "to":"evangelizing"},
{"from":"exorcise", "to":"exorcize"},
{"from":"exorcised", "to":"exorcized"},
{"from":"exorcises", "to":"exorcizes"},
{"from":"exorcising", "to":"exorcizing"},
{"from":"extemporisation", "to":"extemporization"},
{"from":"extemporise", "to":"extemporize"},
{"from":"extemporised", "to":"extemporized"},
{"from":"extemporises", "to":"extemporizes"},
{"from":"extemporising", "to":"extemporizing"},
{"from":"externalisation", "to":"externalization"},
{"from":"externalisations", "to":"externalizations"},
{"from":"externalise", "to":"externalize"},
{"from":"externalised", "to":"externalized"},
{"from":"externalises", "to":"externalizes"},
{"from":"externalising", "to":"externalizing"},
{"from":"factorise", "to":"factorize"},
{"from":"factorised", "to":"factorized"},
{"from":"factorises", "to":"factorizes"},
{"from":"factorising", "to":"factorizing"},
{"from":"faecal", "to":"fecal"},
{"from":"faeces", "to":"feces"},
{"from":"familiarisation", "to":"familiarization"},
{"from":"familiarise", "to":"familiarize"},
{"from":"familiarised", "to":"familiarized"},
{"from":"familiarises", "to":"familiarizes"},
{"from":"familiarising", "to":"familiarizing"},
{"from":"fantasise", "to":"fantasize"},
{"from":"fantasised", "to":"fantasized"},
{"from":"fantasises", "to":"fantasizes"},
{"from":"fantasising", "to":"fantasizing"},
{"from":"favour", "to":"favor"},
{"from":"favourable", "to":"favorable"},
{"from":"favourably", "to":"favorably"},
{"from":"favoured", "to":"favored"},
{"from":"favouring", "to":"favoring"},
{"from":"favourite", "to":"favorite"},
{"from":"favourites", "to":"favorites"},
{"from":"favouritism", "to":"favoritism"},
{"from":"favours", "to":"favors"},
{"from":"feminise", "to":"feminize"},
{"from":"feminised", "to":"feminized"},
{"from":"feminises", "to":"feminizes"},
{"from":"feminising", "to":"feminizing"},
{"from":"fertilisation", "to":"fertilization"},
{"from":"fertilise", "to":"fertilize"},
{"from":"fertilised", "to":"fertilized"},
{"from":"fertiliser", "to":"fertilizer"},
{"from":"fertilisers", "to":"fertilizers"},
{"from":"fertilises", "to":"fertilizes"},
{"from":"fertilising", "to":"fertilizing"},
{"from":"fervour", "to":"fervor"},
{"from":"fibre", "to":"fiber"},
{"from":"fibreglass", "to":"fiberglass"},
{"from":"fibres", "to":"fibers"},
{"from":"fictionalisation", "to":"fictionalization"},
{"from":"fictionalisations", "to":"fictionalizations"},
{"from":"fictionalise", "to":"fictionalize"},
{"from":"fictionalised", "to":"fictionalized"},
{"from":"fictionalises", "to":"fictionalizes"},
{"from":"fictionalising", "to":"fictionalizing"},
{"from":"fillet", "to":"filet"},
{"from":"filleted", "to":"fileted"},
{"from":"filleting", "to":"fileting"},
{"from":"fillets", "to":"filets"},
{"from":"finalisation", "to":"finalization"},
{"from":"finalise", "to":"finalize"},
{"from":"finalised", "to":"finalized"},
{"from":"finalises", "to":"finalizes"},
{"from":"finalising", "to":"finalizing"},
{"from":"flautist", "to":"flutist"},
{"from":"flautists", "to":"flutists"},
{"from":"flavour", "to":"flavor"},
{"from":"flavoured", "to":"flavored"},
{"from":"flavouring", "to":"flavoring"},
{"from":"flavourings", "to":"flavorings"},
{"from":"flavourless", "to":"flavorless"},
{"from":"flavours", "to":"flavors"},
{"from":"flavoursome", "to":"flavorsome"},
{"from":"flyer / flier", "to":"flier / flyer"},
{"from":"foetal", "to":"fetal"},
{"from":"foetid", "to":"fetid"},
{"from":"foetus", "to":"fetus"},
{"from":"foetuses", "to":"fetuses"},
{"from":"formalisation", "to":"formalization"},
{"from":"formalise", "to":"formalize"},
{"from":"formalised", "to":"formalized"},
{"from":"formalises", "to":"formalizes"},
{"from":"formalising", "to":"formalizing"},
{"from":"fossilisation", "to":"fossilization"},
{"from":"fossilise", "to":"fossilize"},
{"from":"fossilised", "to":"fossilized"},
{"from":"fossilises", "to":"fossilizes"},
{"from":"fossilising", "to":"fossilizing"},
{"from":"fraternisation", "to":"fraternization"},
{"from":"fraternise", "to":"fraternize"},
{"from":"fraternised", "to":"fraternized"},
{"from":"fraternises", "to":"fraternizes"},
{"from":"fraternising", "to":"fraternizing"},
{"from":"fulfil", "to":"fulfill"},
{"from":"fulfilment", "to":"fulfillment"},
{"from":"fulfils", "to":"fulfills"},
{"from":"funnelled", "to":"funneled"},
{"from":"funnelling", "to":"funneling"},
{"from":"galvanise", "to":"galvanize"},
{"from":"galvanised", "to":"galvanized"},
{"from":"galvanises", "to":"galvanizes"},
{"from":"galvanising", "to":"galvanizing"},
{"from":"gambolled", "to":"gamboled"},
{"from":"gambolling", "to":"gamboling"},
{"from":"gaol", "to":"jail"},
{"from":"gaolbird", "to":"jailbird"},
{"from":"gaolbirds", "to":"jailbirds"},
{"from":"gaolbreak", "to":"jailbreak"},
{"from":"gaolbreaks", "to":"jailbreaks"},
{"from":"gaoled", "to":"jailed"},
{"from":"gaoler", "to":"jailer"},
{"from":"gaolers", "to":"jailers"},
{"from":"gaoling", "to":"jailing"},
{"from":"gaols", "to":"jails"},
{"from":"gases", "to":"gasses"},
{"from":"gauge", "to":"gage"},
{"from":"gauged", "to":"gaged"},
{"from":"gauges", "to":"gages"},
{"from":"gauging", "to":"gaging"},
{"from":"generalisation", "to":"generalization"},
{"from":"generalisations", "to":"generalizations"},
{"from":"generalise", "to":"generalize"},
{"from":"generalised", "to":"generalized"},
{"from":"generalises", "to":"generalizes"},
{"from":"generalising", "to":"generalizing"},
{"from":"ghettoise", "to":"ghettoize"},
{"from":"ghettoised", "to":"ghettoized"},
{"from":"ghettoises", "to":"ghettoizes"},
{"from":"ghettoising", "to":"ghettoizing"},
{"from":"gipsies", "to":"gypsies"},
{"from":"glamorise", "to":"glamorize"},
{"from":"glamorised", "to":"glamorized"},
{"from":"glamorises", "to":"glamorizes"},
{"from":"glamorising", "to":"glamorizing"},
{"from":"glamour", "to":"glamor"},
{"from":"globalisation", "to":"globalization"},
{"from":"globalise", "to":"globalize"},
{"from":"globalised", "to":"globalized"},
{"from":"globalises", "to":"globalizes"},
{"from":"globalising", "to":"globalizing"},
{"from":"glueing", "to":"gluing"},
{"from":"goitre", "to":"goiter"},
{"from":"goitres", "to":"goiters"},
{"from":"gonorrhoea", "to":"gonorrhea"},
{"from":"gramme", "to":"gram"},
{"from":"grammes", "to":"grams"},
{"from":"gravelled", "to":"graveled"},
{"from":"grey", "to":"gray"},
{"from":"greyed", "to":"grayed"},
{"from":"greying", "to":"graying"},
{"from":"greyish", "to":"grayish"},
{"from":"greyness", "to":"grayness"},
{"from":"greys", "to":"grays"},
{"from":"grovelled", "to":"groveled"},
{"from":"grovelling", "to":"groveling"},
{"from":"groyne", "to":"groin"},
{"from":"groynes", "to":"groins"},
{"from":"gruelling", "to":"grueling"},
{"from":"gruellingly", "to":"gruelingly"},
{"from":"gryphon", "to":"griffin"},
{"from":"gryphons", "to":"griffins"},
{"from":"gynaecological", "to":"gynecological"},
{"from":"gynaecologist", "to":"gynecologist"},
{"from":"gynaecologists", "to":"gynecologists"},
{"from":"gynaecology", "to":"gynecology"},
{"from":"haematological", "to":"hematological"},
{"from":"haematologist", "to":"hematologist"},
{"from":"haematologists", "to":"hematologists"},
{"from":"haematology", "to":"hematology"},
{"from":"haemoglobin", "to":"hemoglobin"},
{"from":"haemophilia", "to":"hemophilia"},
{"from":"haemophiliac", "to":"hemophiliac"},
{"from":"haemophiliacs", "to":"hemophiliacs"},
{"from":"haemorrhage", "to":"hemorrhage"},
{"from":"haemorrhaged", "to":"hemorrhaged"},
{"from":"haemorrhages", "to":"hemorrhages"},
{"from":"haemorrhaging", "to":"hemorrhaging"},
{"from":"haemorrhoids", "to":"hemorrhoids"},
{"from":"harbour", "to":"harbor"},
{"from":"harboured", "to":"harbored"},
{"from":"harbouring", "to":"harboring"},
{"from":"harbours", "to":"harbors"},
{"from":"harmonisation", "to":"harmonization"},
{"from":"harmonise", "to":"harmonize"},
{"from":"harmonised", "to":"harmonized"},
{"from":"harmonises", "to":"harmonizes"},
{"from":"harmonising", "to":"harmonizing"},
{"from":"homoeopath", "to":"homeopath"},
{"from":"homoeopathic", "to":"homeopathic"},
{"from":"homoeopaths", "to":"homeopaths"},
{"from":"homoeopathy", "to":"homeopathy"},
{"from":"homogenise", "to":"homogenize"},
{"from":"homogenised", "to":"homogenized"},
{"from":"homogenises", "to":"homogenizes"},
{"from":"homogenising", "to":"homogenizing"},
{"from":"honour", "to":"honor"},
{"from":"honourable", "to":"honorable"},
{"from":"honourably", "to":"honorably"},
{"from":"honoured", "to":"honored"},
{"from":"honouring", "to":"honoring"},
{"from":"honours", "to":"honors"},
{"from":"hospitalisation", "to":"hospitalization"},
{"from":"hospitalise", "to":"hospitalize"},
{"from":"hospitalised", "to":"hospitalized"},
{"from":"hospitalises", "to":"hospitalizes"},
{"from":"hospitalising", "to":"hospitalizing"},
{"from":"humanise", "to":"humanize"},
{"from":"humanised", "to":"humanized"},
{"from":"humanises", "to":"humanizes"},
{"from":"humanising", "to":"humanizing"},
{"from":"humour", "to":"humor"},
{"from":"humoured", "to":"humored"},
{"from":"humouring", "to":"humoring"},
{"from":"humourless", "to":"humorless"},
{"from":"humours", "to":"humors"},
{"from":"hybridise", "to":"hybridize"},
{"from":"hybridised", "to":"hybridized"},
{"from":"hybridises", "to":"hybridizes"},
{"from":"hybridising", "to":"hybridizing"},
{"from":"hypnotise", "to":"hypnotize"},
{"from":"hypnotised", "to":"hypnotized"},
{"from":"hypnotises", "to":"hypnotizes"},
{"from":"hypnotising", "to":"hypnotizing"},
{"from":"hypothesise", "to":"hypothesize"},
{"from":"hypothesised", "to":"hypothesized"},
{"from":"hypothesises", "to":"hypothesizes"},
{"from":"hypothesising", "to":"hypothesizing"},
{"from":"idealisation", "to":"idealization"},
{"from":"idealise", "to":"idealize"},
{"from":"idealised", "to":"idealized"},
{"from":"idealises", "to":"idealizes"},
{"from":"idealising", "to":"idealizing"},
{"from":"idolise", "to":"idolize"},
{"from":"idolised", "to":"idolized"},
{"from":"idolises", "to":"idolizes"},
{"from":"idolising", "to":"idolizing"},
{"from":"immobilisation", "to":"immobilization"},
{"from":"immobilise", "to":"immobilize"},
{"from":"immobilised", "to":"immobilized"},
{"from":"immobiliser", "to":"immobilizer"},
{"from":"immobilisers", "to":"immobilizers"},
{"from":"immobilises", "to":"immobilizes"},
{"from":"immobilising", "to":"immobilizing"},
{"from":"immortalise", "to":"immortalize"},
{"from":"immortalised", "to":"immortalized"},
{"from":"immortalises", "to":"immortalizes"},
{"from":"immortalising", "to":"immortalizing"},
{"from":"immunisation", "to":"immunization"},
{"from":"immunise", "to":"immunize"},
{"from":"immunised", "to":"immunized"},
{"from":"immunises", "to":"immunizes"},
{"from":"immunising", "to":"immunizing"},
{"from":"impanelled", "to":"impaneled"},
{"from":"impanelling", "to":"impaneling"},
{"from":"imperilled", "to":"imperiled"},
{"from":"imperilling", "to":"imperiling"},
{"from":"individualise", "to":"individualize"},
{"from":"individualised", "to":"individualized"},
{"from":"individualises", "to":"individualizes"},
{"from":"individualising", "to":"individualizing"},
{"from":"industrialise", "to":"industrialize"},
{"from":"industrialised", "to":"industrialized"},
{"from":"industrialises", "to":"industrializes"},
{"from":"industrialising", "to":"industrializing"},
{"from":"inflexion", "to":"inflection"},
{"from":"inflexions", "to":"inflections"},
{"from":"initialise", "to":"initialize"},
{"from":"initialised", "to":"initialized"},
{"from":"initialises", "to":"initializes"},
{"from":"initialising", "to":"initializing"},
{"from":"initialled", "to":"initialed"},
{"from":"initialling", "to":"initialing"},
{"from":"instal", "to":"install"},
{"from":"instalment", "to":"installment"},
{"from":"instalments", "to":"installments"},
{"from":"instals", "to":"installs"},
{"from":"instil", "to":"instill"},
{"from":"instils", "to":"instills"},
{"from":"institutionalisation", "to":"institutionalization"},
{"from":"institutionalise", "to":"institutionalize"},
{"from":"institutionalised", "to":"institutionalized"},
{"from":"institutionalises", "to":"institutionalizes"},
{"from":"institutionalising", "to":"institutionalizing"},
{"from":"intellectualise", "to":"intellectualize"},
{"from":"intellectualised", "to":"intellectualized"},
{"from":"intellectualises", "to":"intellectualizes"},
{"from":"intellectualising", "to":"intellectualizing"},
{"from":"internalisation", "to":"internalization"},
{"from":"internalise", "to":"internalize"},
{"from":"internalised", "to":"internalized"},
{"from":"internalises", "to":"internalizes"},
{"from":"internalising", "to":"internalizing"},
{"from":"internationalisation", "to":"internationalization"},
{"from":"internationalise", "to":"internationalize"},
{"from":"internationalised", "to":"internationalized"},
{"from":"internationalises", "to":"internationalizes"},
{"from":"internationalising", "to":"internationalizing"},
{"from":"ionisation", "to":"ionization"},
{"from":"ionise", "to":"ionize"},
{"from":"ionised", "to":"ionized"},
{"from":"ioniser", "to":"ionizer"},
{"from":"ionisers", "to":"ionizers"},
{"from":"ionises", "to":"ionizes"},
{"from":"ionising", "to":"ionizing"},
{"from":"italicise", "to":"italicize"},
{"from":"italicised", "to":"italicized"},
{"from":"italicises", "to":"italicizes"},
{"from":"italicising", "to":"italicizing"},
{"from":"itemise", "to":"itemize"},
{"from":"itemised", "to":"itemized"},
{"from":"itemises", "to":"itemizes"},
{"from":"itemising", "to":"itemizing"},
{"from":"jeopardise", "to":"jeopardize"},
{"from":"jeopardised", "to":"jeopardized"},
{"from":"jeopardises", "to":"jeopardizes"},
{"from":"jeopardising", "to":"jeopardizing"},
{"from":"jewelled", "to":"jeweled"},
{"from":"jeweller", "to":"jeweler"},
{"from":"jewellers", "to":"jewelers"},
{"from":"jewellery", "to":"jewelry"},
{"from":"judgement", "to":"judgment"},
{"from":"kilogramme", "to":"kilogram"},
{"from":"kilogrammes", "to":"kilograms"},
{"from":"kilometre", "to":"kilometer"},
{"from":"kilometres", "to":"kilometers"},
{"from":"labelled", "to":"labeled"},
{"from":"labelling", "to":"labeling"},
{"from":"labour", "to":"labor"},
{"from":"laboured", "to":"labored"},
{"from":"labourer", "to":"laborer"},
{"from":"labourers", "to":"laborers"},
{"from":"labouring", "to":"laboring"},
{"from":"labours", "to":"labors"},
{"from":"lacklustre", "to":"lackluster"},
{"from":"legalisation", "to":"legalization"},
{"from":"legalise", "to":"legalize"},
{"from":"legalised", "to":"legalized"},
{"from":"legalises", "to":"legalizes"},
{"from":"legalising", "to":"legalizing"},
{"from":"legitimise", "to":"legitimize"},
{"from":"legitimised", "to":"legitimized"},
{"from":"legitimises", "to":"legitimizes"},
{"from":"legitimising", "to":"legitimizing"},
{"from":"leukaemia", "to":"leukemia"},
{"from":"levelled", "to":"leveled"},
{"from":"leveller", "to":"leveler"},
{"from":"levellers", "to":"levelers"},
{"from":"levelling", "to":"leveling"},
{"from":"libelled", "to":"libeled"},
{"from":"libelling", "to":"libeling"},
{"from":"libellous", "to":"libelous"},
{"from":"liberalisation", "to":"liberalization"},
{"from":"liberalise", "to":"liberalize"},
{"from":"liberalised", "to":"liberalized"},
{"from":"liberalises", "to":"liberalizes"},
{"from":"liberalising", "to":"liberalizing"},
{"from":"licence", "to":"license"},
{"from":"licenced", "to":"licensed"},
{"from":"licences", "to":"licenses"},
{"from":"licencing", "to":"licensing"},
{"from":"likeable", "to":"likable"},
{"from":"lionisation", "to":"lionization"},
{"from":"lionise", "to":"lionize"},
{"from":"lionised", "to":"lionized"},
{"from":"lionises", "to":"lionizes"},
{"from":"lionising", "to":"lionizing"},
{"from":"liquidise", "to":"liquidize"},
{"from":"liquidised", "to":"liquidized"},
{"from":"liquidiser", "to":"liquidizer"},
{"from":"liquidisers", "to":"liquidizers"},
{"from":"liquidises", "to":"liquidizes"},
{"from":"liquidising", "to":"liquidizing"},
{"from":"litre", "to":"liter"},
{"from":"litres", "to":"liters"},
{"from":"localise", "to":"localize"},
{"from":"localised", "to":"localized"},
{"from":"localises", "to":"localizes"},
{"from":"localising", "to":"localizing"},
{"from":"louvre", "to":"louver"},
{"from":"louvred", "to":"louvered"},
{"from":"louvres", "to":"louvers"},
{"from":"lustre", "to":"luster"},
{"from":"magnetise", "to":"magnetize"},
{"from":"magnetised", "to":"magnetized"},
{"from":"magnetises", "to":"magnetizes"},
{"from":"magnetising", "to":"magnetizing"},
{"from":"manoeuvrability", "to":"maneuverability"},
{"from":"manoeuvrable", "to":"maneuverable"},
{"from":"manoeuvre", "to":"maneuver"},
{"from":"manoeuvred", "to":"maneuvered"},
{"from":"manoeuvres", "to":"maneuvers"},
{"from":"manoeuvring", "to":"maneuvering"},
{"from":"manoeuvrings", "to":"maneuverings"},
{"from":"marginalisation", "to":"marginalization"},
{"from":"marginalise", "to":"marginalize"},
{"from":"marginalised", "to":"marginalized"},
{"from":"marginalises", "to":"marginalizes"},
{"from":"marginalising", "to":"marginalizing"},
{"from":"marshalled", "to":"marshaled"},
{"from":"marshalling", "to":"marshaling"},
{"from":"marvelled", "to":"marveled"},
{"from":"marvelling", "to":"marveling"},
{"from":"marvellous", "to":"marvelous"},
{"from":"marvellously", "to":"marvelously"},
{"from":"materialisation", "to":"materialization"},
{"from":"materialise", "to":"materialize"},
{"from":"materialised", "to":"materialized"},
{"from":"materialises", "to":"materializes"},
{"from":"materialising", "to":"materializing"},
{"from":"maximisation", "to":"maximization"},
{"from":"maximise", "to":"maximize"},
{"from":"maximised", "to":"maximized"},
{"from":"maximises", "to":"maximizes"},
{"from":"maximising", "to":"maximizing"},
{"from":"meagre", "to":"meager"},
{"from":"mechanisation", "to":"mechanization"},
{"from":"mechanise", "to":"mechanize"},
{"from":"mechanised", "to":"mechanized"},
{"from":"mechanises", "to":"mechanizes"},
{"from":"mechanising", "to":"mechanizing"},
{"from":"mediaeval", "to":"medieval"},
{"from":"memorialise", "to":"memorialize"},
{"from":"memorialised", "to":"memorialized"},
{"from":"memorialises", "to":"memorializes"},
{"from":"memorialising", "to":"memorializing"},
{"from":"memorise", "to":"memorize"},
{"from":"memorised", "to":"memorized"},
{"from":"memorises", "to":"memorizes"},
{"from":"memorising", "to":"memorizing"},
{"from":"mesmerise", "to":"mesmerize"},
{"from":"mesmerised", "to":"mesmerized"},
{"from":"mesmerises", "to":"mesmerizes"},
{"from":"mesmerising", "to":"mesmerizing"},
{"from":"metabolise", "to":"metabolize"},
{"from":"metabolised", "to":"metabolized"},
{"from":"metabolises", "to":"metabolizes"},
{"from":"metabolising", "to":"metabolizing"},
{"from":"metre", "to":"meter"},
{"from":"metres", "to":"meters"},
{"from":"micrometre", "to":"micrometer"},
{"from":"micrometres", "to":"micrometers"},
{"from":"militarise", "to":"militarize"},
{"from":"militarised", "to":"militarized"},
{"from":"militarises", "to":"militarizes"},
{"from":"militarising", "to":"militarizing"},
{"from":"milligramme", "to":"milligram"},
{"from":"milligrammes", "to":"milligrams"},
{"from":"millilitre", "to":"milliliter"},
{"from":"millilitres", "to":"milliliters"},
{"from":"millimetre", "to":"millimeter"},
{"from":"millimetres", "to":"millimeters"},
{"from":"miniaturisation", "to":"miniaturization"},
{"from":"miniaturise", "to":"miniaturize"},
{"from":"miniaturised", "to":"miniaturized"},
{"from":"miniaturises", "to":"miniaturizes"},
{"from":"miniaturising", "to":"miniaturizing"},
{"from":"minibuses", "to":"minibusses"},
{"from":"minimise", "to":"minimize"},
{"from":"minimised", "to":"minimized"},
{"from":"minimises", "to":"minimizes"},
{"from":"minimising", "to":"minimizing"},
{"from":"misbehaviour", "to":"misbehavior"},
{"from":"misdemeanour", "to":"misdemeanor"},
{"from":"misdemeanours", "to":"misdemeanors"},
{"from":"misspelt", "to":"misspelled"},
{"from":"mitre", "to":"miter"},
{"from":"mitres", "to":"miters"},
{"from":"mobilisation", "to":"mobilization"},
{"from":"mobilise", "to":"mobilize"},
{"from":"mobilised", "to":"mobilized"},
{"from":"mobilises", "to":"mobilizes"},
{"from":"mobilising", "to":"mobilizing"},
{"from":"modelled", "to":"modeled"},
{"from":"modeller", "to":"modeler"},
{"from":"modellers", "to":"modelers"},
{"from":"modelling", "to":"modeling"},
{"from":"modernise", "to":"modernize"},
{"from":"modernised", "to":"modernized"},
{"from":"modernises", "to":"modernizes"},
{"from":"modernising", "to":"modernizing"},
{"from":"moisturise", "to":"moisturize"},
{"from":"moisturised", "to":"moisturized"},
{"from":"moisturiser", "to":"moisturizer"},
{"from":"moisturisers", "to":"moisturizers"},
{"from":"moisturises", "to":"moisturizes"},
{"from":"moisturising", "to":"moisturizing"},
{"from":"monologue", "to":"monolog"},
{"from":"monologues", "to":"monologs"},
{"from":"monopolisation", "to":"monopolization"},
{"from":"monopolise", "to":"monopolize"},
{"from":"monopolised", "to":"monopolized"},
{"from":"monopolises", "to":"monopolizes"},
{"from":"monopolising", "to":"monopolizing"},
{"from":"moralise", "to":"moralize"},
{"from":"moralised", "to":"moralized"},
{"from":"moralises", "to":"moralizes"},
{"from":"moralising", "to":"moralizing"},
{"from":"motorised", "to":"motorized"},
{"from":"mould", "to":"mold"},
{"from":"moulded", "to":"molded"},
{"from":"moulder", "to":"molder"},
{"from":"mouldered", "to":"moldered"},
{"from":"mouldering", "to":"moldering"},
{"from":"moulders", "to":"molders"},
{"from":"mouldier", "to":"moldier"},
{"from":"mouldiest", "to":"moldiest"},
{"from":"moulding", "to":"molding"},
{"from":"mouldings", "to":"moldings"},
{"from":"moulds", "to":"molds"},
{"from":"mouldy", "to":"moldy"},
{"from":"moult", "to":"molt"},
{"from":"moulted", "to":"molted"},
{"from":"moulting", "to":"molting"},
{"from":"moults", "to":"molts"},
{"from":"moustache", "to":"mustache"},
{"from":"moustached", "to":"mustached"},
{"from":"moustaches", "to":"mustaches"},
{"from":"moustachioed", "to":"mustachioed"},
{"from":"multicoloured", "to":"multicolored"},
{"from":"nationalisation", "to":"nationalization"},
{"from":"nationalisations", "to":"nationalizations"},
{"from":"nationalise", "to":"nationalize"},
{"from":"nationalised", "to":"nationalized"},
{"from":"nationalises", "to":"nationalizes"},
{"from":"nationalising", "to":"nationalizing"},
{"from":"naturalisation", "to":"naturalization"},
{"from":"naturalise", "to":"naturalize"},
{"from":"naturalised", "to":"naturalized"},
{"from":"naturalises", "to":"naturalizes"},
{"from":"naturalising", "to":"naturalizing"},
{"from":"neighbour", "to":"neighbor"},
{"from":"neighbourhood", "to":"neighborhood"},
{"from":"neighbourhoods", "to":"neighborhoods"},
{"from":"neighbouring", "to":"neighboring"},
{"from":"neighbourliness", "to":"neighborliness"},
{"from":"neighbourly", "to":"neighborly"},
{"from":"neighbours", "to":"neighbors"},
{"from":"neutralisation", "to":"neutralization"},
{"from":"neutralise", "to":"neutralize"},
{"from":"neutralised", "to":"neutralized"},
{"from":"neutralises", "to":"neutralizes"},
{"from":"neutralising", "to":"neutralizing"},
{"from":"normalisation", "to":"normalization"},
{"from":"normalise", "to":"normalize"},
{"from":"normalised", "to":"normalized"},
{"from":"normalises", "to":"normalizes"},
{"from":"normalising", "to":"normalizing"},
{"from":"odour", "to":"odor"},
{"from":"odourless", "to":"odorless"},
{"from":"odours", "to":"odors"},
{"from":"oesophagus", "to":"esophagus"},
{"from":"oesophaguses", "to":"esophaguses"},
{"from":"oestrogen", "to":"estrogen"},
{"from":"offence", "to":"offense"},
{"from":"offences", "to":"offenses"},
{"from":"omelette", "to":"omelet"},
{"from":"omelettes", "to":"omelets"},
{"from":"optimise", "to":"optimize"},
{"from":"optimised", "to":"optimized"},
{"from":"optimises", "to":"optimizes"},
{"from":"optimising", "to":"optimizing"},
{"from":"organisation", "to":"organization"},
{"from":"organisational", "to":"organizational"},
{"from":"organisations", "to":"organizations"},
{"from":"organise", "to":"organize"},
{"from":"organised", "to":"organized"},
{"from":"organiser", "to":"organizer"},
{"from":"organisers", "to":"organizers"},
{"from":"organises", "to":"organizes"},
{"from":"organising", "to":"organizing"},
{"from":"orthopaedic", "to":"orthopedic"},
{"from":"orthopaedics", "to":"orthopedics"},
{"from":"ostracise", "to":"ostracize"},
{"from":"ostracised", "to":"ostracized"},
{"from":"ostracises", "to":"ostracizes"},
{"from":"ostracising", "to":"ostracizing"},
{"from":"outmanoeuvre", "to":"outmaneuver"},
{"from":"outmanoeuvred", "to":"outmaneuvered"},
{"from":"outmanoeuvres", "to":"outmaneuvers"},
{"from":"outmanoeuvring", "to":"outmaneuvering"},
{"from":"overemphasise", "to":"overemphasize"},
{"from":"overemphasised", "to":"overemphasized"},
{"from":"overemphasises", "to":"overemphasizes"},
{"from":"overemphasising", "to":"overemphasizing"},
{"from":"oxidisation", "to":"oxidization"},
{"from":"oxidise", "to":"oxidize"},
{"from":"oxidised", "to":"oxidized"},
{"from":"oxidises", "to":"oxidizes"},
{"from":"oxidising", "to":"oxidizing"},
{"from":"paederast", "to":"pederast"},
{"from":"paederasts", "to":"pederasts"},
{"from":"paediatric", "to":"pediatric"},
{"from":"paediatrician", "to":"pediatrician"},
{"from":"paediatricians", "to":"pediatricians"},
{"from":"paediatrics", "to":"pediatrics"},
{"from":"paedophile", "to":"pedophile"},
{"from":"paedophiles", "to":"pedophiles"},
{"from":"paedophilia", "to":"pedophilia"},
{"from":"palaeolithic", "to":"paleolithic"},
{"from":"palaeontologist", "to":"paleontologist"},
{"from":"palaeontologists", "to":"paleontologists"},
{"from":"palaeontology", "to":"paleontology"},
{"from":"panelled", "to":"paneled"},
{"from":"panelling", "to":"paneling"},
{"from":"panellist", "to":"panelist"},
{"from":"panellists", "to":"panelists"},
{"from":"paralyse", "to":"paralyze"},
{"from":"paralysed", "to":"paralyzed"},
{"from":"paralyses", "to":"paralyzes"},
{"from":"paralysing", "to":"paralyzing"},
{"from":"parcelled", "to":"parceled"},
{"from":"parcelling", "to":"parceling"},
{"from":"parlour", "to":"parlor"},
{"from":"parlours", "to":"parlors"},
{"from":"particularise", "to":"particularize"},
{"from":"particularised", "to":"particularized"},
{"from":"particularises", "to":"particularizes"},
{"from":"particularising", "to":"particularizing"},
{"from":"passivisation", "to":"passivization"},
{"from":"passivise", "to":"passivize"},
{"from":"passivised", "to":"passivized"},
{"from":"passivises", "to":"passivizes"},
{"from":"passivising", "to":"passivizing"},
{"from":"pasteurisation", "to":"pasteurization"},
{"from":"pasteurise", "to":"pasteurize"},
{"from":"pasteurised", "to":"pasteurized"},
{"from":"pasteurises", "to":"pasteurizes"},
{"from":"pasteurising", "to":"pasteurizing"},
{"from":"patronise", "to":"patronize"},
{"from":"patronised", "to":"patronized"},
{"from":"patronises", "to":"patronizes"},
{"from":"patronising", "to":"patronizing"},
{"from":"patronisingly", "to":"patronizingly"},
{"from":"pedalled", "to":"pedaled"},
{"from":"pedalling", "to":"pedaling"},
{"from":"pedestrianisation", "to":"pedestrianization"},
{"from":"pedestrianise", "to":"pedestrianize"},
{"from":"pedestrianised", "to":"pedestrianized"},
{"from":"pedestrianises", "to":"pedestrianizes"},
{"from":"pedestrianising", "to":"pedestrianizing"},
{"from":"penalise", "to":"penalize"},
{"from":"penalised", "to":"penalized"},
{"from":"penalises", "to":"penalizes"},
{"from":"penalising", "to":"penalizing"},
{"from":"pencilled", "to":"penciled"},
{"from":"pencilling", "to":"penciling"},
{"from":"personalise", "to":"personalize"},
{"from":"personalised", "to":"personalized"},
{"from":"personalises", "to":"personalizes"},
{"from":"personalising", "to":"personalizing"},
{"from":"pharmacopoeia", "to":"pharmacopeia"},
{"from":"pharmacopoeias", "to":"pharmacopeias"},
{"from":"philosophise", "to":"philosophize"},
{"from":"philosophised", "to":"philosophized"},
{"from":"philosophises", "to":"philosophizes"},
{"from":"philosophising", "to":"philosophizing"},
{"from":"philtre", "to":"filter"},
{"from":"philtres", "to":"filters"},
{"from":"phoney", "to":"phony"},
{"from":"plagiarise", "to":"plagiarize"},
{"from":"plagiarised", "to":"plagiarized"},
{"from":"plagiarises", "to":"plagiarizes"},
{"from":"plagiarising", "to":"plagiarizing"},
{"from":"plough", "to":"plow"},
{"from":"ploughed", "to":"plowed"},
{"from":"ploughing", "to":"plowing"},
{"from":"ploughman", "to":"plowman"},
{"from":"ploughmen", "to":"plowmen"},
{"from":"ploughs", "to":"plows"},
{"from":"ploughshare", "to":"plowshare"},
{"from":"ploughshares", "to":"plowshares"},
{"from":"polarisation", "to":"polarization"},
{"from":"polarise", "to":"polarize"},
{"from":"polarised", "to":"polarized"},
{"from":"polarises", "to":"polarizes"},
{"from":"polarising", "to":"polarizing"},
{"from":"politicisation", "to":"politicization"},
{"from":"politicise", "to":"politicize"},
{"from":"politicised", "to":"politicized"},
{"from":"politicises", "to":"politicizes"},
{"from":"politicising", "to":"politicizing"},
{"from":"popularisation", "to":"popularization"},
{"from":"popularise", "to":"popularize"},
{"from":"popularised", "to":"popularized"},
{"from":"popularises", "to":"popularizes"},
{"from":"popularising", "to":"popularizing"},
{"from":"pouffe", "to":"pouf"},
{"from":"pouffes", "to":"poufs"},
{"from":"practise", "to":"practice"},
{"from":"practised", "to":"practiced"},
{"from":"practises", "to":"practices"},
{"from":"practising", "to":"practicing"},
{"from":"praesidium", "to":"presidium"},
{"from":"praesidiums", "to":"presidiums"},
{"from":"pressurisation", "to":"pressurization"},
{"from":"pressurise", "to":"pressurize"},
{"from":"pressurised", "to":"pressurized"},
{"from":"pressurises", "to":"pressurizes"},
{"from":"pressurising", "to":"pressurizing"},
{"from":"pretence", "to":"pretense"},
{"from":"pretences", "to":"pretenses"},
{"from":"primaeval", "to":"primeval"},
{"from":"prioritisation", "to":"prioritization"},
{"from":"prioritise", "to":"prioritize"},
{"from":"prioritised", "to":"prioritized"},
{"from":"prioritises", "to":"prioritizes"},
{"from":"prioritising", "to":"prioritizing"},
{"from":"privatisation", "to":"privatization"},
{"from":"privatisations", "to":"privatizations"},
{"from":"privatise", "to":"privatize"},
{"from":"privatised", "to":"privatized"},
{"from":"privatises", "to":"privatizes"},
{"from":"privatising", "to":"privatizing"},
{"from":"professionalisation", "to":"professionalization"},
{"from":"professionalise", "to":"professionalize"},
{"from":"professionalised", "to":"professionalized"},
{"from":"professionalises", "to":"professionalizes"},
{"from":"professionalising", "to":"professionalizing"},
{"from":"programme", "to":"program"},
{"from":"programmes", "to":"programs"},
{"from":"prologue", "to":"prolog"},
{"from":"prologues", "to":"prologs"},
{"from":"propagandise", "to":"propagandize"},
{"from":"propagandised", "to":"propagandized"},
{"from":"propagandises", "to":"propagandizes"},
{"from":"propagandising", "to":"propagandizing"},
{"from":"proselytise", "to":"proselytize"},
{"from":"proselytised", "to":"proselytized"},
{"from":"proselytiser", "to":"proselytizer"},
{"from":"proselytisers", "to":"proselytizers"},
{"from":"proselytises", "to":"proselytizes"},
{"from":"proselytising", "to":"proselytizing"},
{"from":"psychoanalyse", "to":"psychoanalyze"},
{"from":"psychoanalysed", "to":"psychoanalyzed"},
{"from":"psychoanalyses", "to":"psychoanalyzes"},
{"from":"psychoanalysing", "to":"psychoanalyzing"},
{"from":"publicise", "to":"publicize"},
{"from":"publicised", "to":"publicized"},
{"from":"publicises", "to":"publicizes"},
{"from":"publicising", "to":"publicizing"},
{"from":"pulverisation", "to":"pulverization"},
{"from":"pulverise", "to":"pulverize"},
{"from":"pulverised", "to":"pulverized"},
{"from":"pulverises", "to":"pulverizes"},
{"from":"pulverising", "to":"pulverizing"},
{"from":"pummelled", "to":"pummel"},
{"from":"pummelling", "to":"pummeled"},
{"from":"pyjama", "to":"pajama"},
{"from":"pyjamas", "to":"pajamas"},
{"from":"pzazz", "to":"pizzazz"},
{"from":"quarrelled", "to":"quarreled"},
{"from":"quarrelling", "to":"quarreling"},
{"from":"radicalise", "to":"radicalize"},
{"from":"radicalised", "to":"radicalized"},
{"from":"radicalises", "to":"radicalizes"},
{"from":"radicalising", "to":"radicalizing"},
{"from":"rancour", "to":"rancor"},
{"from":"randomise", "to":"randomize"},
{"from":"randomised", "to":"randomized"},
{"from":"randomises", "to":"randomizes"},
{"from":"randomising", "to":"randomizing"},
{"from":"rationalisation", "to":"rationalization"},
{"from":"rationalisations", "to":"rationalizations"},
{"from":"rationalise", "to":"rationalize"},
{"from":"rationalised", "to":"rationalized"},
{"from":"rationalises", "to":"rationalizes"},
{"from":"rationalising", "to":"rationalizing"},
{"from":"ravelled", "to":"raveled"},
{"from":"ravelling", "to":"raveling"},
{"from":"realisable", "to":"realizable"},
{"from":"realisation", "to":"realization"},
{"from":"realisations", "to":"realizations"},
{"from":"realise", "to":"realize"},
{"from":"realised", "to":"realized"},
{"from":"realises", "to":"realizes"},
{"from":"realising", "to":"realizing"},
{"from":"recognisable", "to":"recognizable"},
{"from":"recognisably", "to":"recognizably"},
{"from":"recognisance", "to":"recognizance"},
{"from":"recognise", "to":"recognize"},
{"from":"recognised", "to":"recognized"},
{"from":"recognises", "to":"recognizes"},
{"from":"recognising", "to":"recognizing"},
{"from":"reconnoitre", "to":"reconnoiter"},
{"from":"reconnoitred", "to":"reconnoitered"},
{"from":"reconnoitres", "to":"reconnoiters"},
{"from":"reconnoitring", "to":"reconnoitering"},
{"from":"refuelled", "to":"refueled"},
{"from":"refuelling", "to":"refueling"},
{"from":"regularisation", "to":"regularization"},
{"from":"regularise", "to":"regularize"},
{"from":"regularised", "to":"regularized"},
{"from":"regularises", "to":"regularizes"},
{"from":"regularising", "to":"regularizing"},
{"from":"remodelled", "to":"remodeled"},
{"from":"remodelling", "to":"remodeling"},
{"from":"remould", "to":"remold"},
{"from":"remoulded", "to":"remolded"},
{"from":"remoulding", "to":"remolding"},
{"from":"remoulds", "to":"remolds"},
{"from":"reorganisation", "to":"reorganization"},
{"from":"reorganisations", "to":"reorganizations"},
{"from":"reorganise", "to":"reorganize"},
{"from":"reorganised", "to":"reorganized"},
{"from":"reorganises", "to":"reorganizes"},
{"from":"reorganising", "to":"reorganizing"},
{"from":"revelled", "to":"reveled"},
{"from":"reveller", "to":"reveler"},
{"from":"revellers", "to":"revelers"},
{"from":"revelling", "to":"reveling"},
{"from":"revitalise", "to":"revitalize"},
{"from":"revitalised", "to":"revitalized"},
{"from":"revitalises", "to":"revitalizes"},
{"from":"revitalising", "to":"revitalizing"},
{"from":"revolutionise", "to":"revolutionize"},
{"from":"revolutionised", "to":"revolutionized"},
{"from":"revolutionises", "to":"revolutionizes"},
{"from":"revolutionising", "to":"revolutionizing"},
{"from":"rhapsodise", "to":"rhapsodize"},
{"from":"rhapsodised", "to":"rhapsodized"},
{"from":"rhapsodises", "to":"rhapsodizes"},
{"from":"rhapsodising", "to":"rhapsodizing"},
{"from":"rigour", "to":"rigor"},
{"from":"rigours", "to":"rigors"},
{"from":"ritualised", "to":"ritualized"},
{"from":"rivalled", "to":"rivaled"},
{"from":"rivalling", "to":"rivaling"},
{"from":"romanticise", "to":"romanticize"},
{"from":"romanticised", "to":"romanticized"},
{"from":"romanticises", "to":"romanticizes"},
{"from":"romanticising", "to":"romanticizing"},
{"from":"rumour", "to":"rumor"},
{"from":"rumoured", "to":"rumored"},
{"from":"rumours", "to":"rumors"},
{"from":"sabre", "to":"saber"},
{"from":"sabres", "to":"sabers"},
{"from":"saltpetre", "to":"saltpeter"},
{"from":"sanitise", "to":"sanitize"},
{"from":"sanitised", "to":"sanitized"},
{"from":"sanitises", "to":"sanitizes"},
{"from":"sanitising", "to":"sanitizing"},
{"from":"satirise", "to":"satirize"},
{"from":"satirised", "to":"satirized"},
{"from":"satirises", "to":"satirizes"},
{"from":"satirising", "to":"satirizing"},
{"from":"saviour", "to":"savior"},
{"from":"saviours", "to":"saviors"},
{"from":"savour", "to":"savor"},
{"from":"savoured", "to":"savored"},
{"from":"savouries", "to":"savories"},
{"from":"savouring", "to":"savoring"},
{"from":"savours", "to":"savors"},
{"from":"savoury", "to":"savory"},
{"from":"scandalise", "to":"scandalize"},
{"from":"scandalised", "to":"scandalized"},
{"from":"scandalises", "to":"scandalizes"},
{"from":"scandalising", "to":"scandalizing"},
{"from":"sceptic", "to":"skeptic"},
{"from":"sceptical", "to":"skeptical"},
{"from":"sceptically", "to":"skeptically"},
{"from":"scepticism", "to":"skepticism"},
{"from":"sceptics", "to":"skeptics"},
{"from":"sceptre", "to":"scepter"},
{"from":"sceptres", "to":"scepters"},
{"from":"scrutinise", "to":"scrutinize"},
{"from":"scrutinised", "to":"scrutinized"},
{"from":"scrutinises", "to":"scrutinizes"},
{"from":"scrutinising", "to":"scrutinizing"},
{"from":"secularisation", "to":"secularization"},
{"from":"secularise", "to":"secularize"},
{"from":"secularised", "to":"secularized"},
{"from":"secularises", "to":"secularizes"},
{"from":"secularising", "to":"secularizing"},
{"from":"sensationalise", "to":"sensationalize"},
{"from":"sensationalised", "to":"sensationalized"},
{"from":"sensationalises", "to":"sensationalizes"},
{"from":"sensationalising", "to":"sensationalizing"},
{"from":"sensitise", "to":"sensitize"},
{"from":"sensitised", "to":"sensitized"},
{"from":"sensitises", "to":"sensitizes"},
{"from":"sensitising", "to":"sensitizing"},
{"from":"sentimentalise", "to":"sentimentalize"},
{"from":"sentimentalised", "to":"sentimentalized"},
{"from":"sentimentalises", "to":"sentimentalizes"},
{"from":"sentimentalising", "to":"sentimentalizing"},
{"from":"sepulchre", "to":"sepulcher"},
{"from":"sepulchres", "to":"sepulchers"},
{"from":"serialisation", "to":"serialization"},
{"from":"serialisations", "to":"serializations"},
{"from":"serialise", "to":"serialize"},
{"from":"serialised", "to":"serialized"},
{"from":"serialises", "to":"serializes"},
{"from":"serialising", "to":"serializing"},
{"from":"sermonise", "to":"sermonize"},
{"from":"sermonised", "to":"sermonized"},
{"from":"sermonises", "to":"sermonizes"},
{"from":"sermonising", "to":"sermonizing"},
{"from":"sheikh", "to":"sheik"},
{"from":"shovelled", "to":"shoveled"},
{"from":"shovelling", "to":"shoveling"},
{"from":"shrivelled", "to":"shriveled"},
{"from":"shrivelling", "to":"shriveling"},
{"from":"signalise", "to":"signalize"},
{"from":"signalised", "to":"signalized"},
{"from":"signalises", "to":"signalizes"},
{"from":"signalising", "to":"signalizing"},
{"from":"signalled", "to":"signaled"},
{"from":"signalling", "to":"signaling"},
{"from":"smoulder", "to":"smolder"},
{"from":"smouldered", "to":"smoldered"},
{"from":"smouldering", "to":"smoldering"},
{"from":"smoulders", "to":"smolders"},
{"from":"snivelled", "to":"sniveled"},
{"from":"snivelling", "to":"sniveling"},
{"from":"snorkelled", "to":"snorkeled"},
{"from":"snorkelling", "to":"snorkeling"},
{"from":"snowplough", "to":"snowplow"},
{"from":"snowploughs", "to":"snowplow"},
{"from":"socialisation", "to":"socialization"},
{"from":"socialise", "to":"socialize"},
{"from":"socialised", "to":"socialized"},
{"from":"socialises", "to":"socializes"},
{"from":"socialising", "to":"socializing"},
{"from":"sodomise", "to":"sodomize"},
{"from":"sodomised", "to":"sodomized"},
{"from":"sodomises", "to":"sodomizes"},
{"from":"sodomising", "to":"sodomizing"},
{"from":"solemnise", "to":"solemnize"},
{"from":"solemnised", "to":"solemnized"},
{"from":"solemnises", "to":"solemnizes"},
{"from":"solemnising", "to":"solemnizing"},
{"from":"sombre", "to":"somber"},
{"from":"specialisation", "to":"specialization"},
{"from":"specialisations", "to":"specializations"},
{"from":"specialise", "to":"specialize"},
{"from":"specialised", "to":"specialized"},
{"from":"specialises", "to":"specializes"},
{"from":"specialising", "to":"specializing"},
{"from":"spectre", "to":"specter"},
{"from":"spectres", "to":"specters"},
{"from":"spiralled", "to":"spiraled"},
{"from":"spiralling", "to":"spiraling"},
{"from":"splendour", "to":"splendor"},
{"from":"splendours", "to":"splendors"},
{"from":"squirrelled", "to":"squirreled"},
{"from":"squirrelling", "to":"squirreling"},
{"from":"stabilisation", "to":"stabilization"},
{"from":"stabilise", "to":"stabilize"},
{"from":"stabilised", "to":"stabilized"},
{"from":"stabiliser", "to":"stabilizer"},
{"from":"stabilisers", "to":"stabilizers"},
{"from":"stabilises", "to":"stabilizes"},
{"from":"stabilising", "to":"stabilizing"},
{"from":"standardisation", "to":"standardization"},
{"from":"standardise", "to":"standardize"},
{"from":"standardised", "to":"standardized"},
{"from":"standardises", "to":"standardizes"},
{"from":"standardising", "to":"standardizing"},
{"from":"stencilled", "to":"stenciled"},
{"from":"stencilling", "to":"stenciling"},
{"from":"sterilisation", "to":"sterilization"},
{"from":"sterilisations", "to":"sterilizations"},
{"from":"sterilise", "to":"sterilize"},
{"from":"sterilised", "to":"sterilized"},
{"from":"steriliser", "to":"sterilizer"},
{"from":"sterilisers", "to":"sterilizers"},
{"from":"sterilises", "to":"sterilizes"},
{"from":"sterilising", "to":"sterilizing"},
{"from":"stigmatisation", "to":"stigmatization"},
{"from":"stigmatise", "to":"stigmatize"},
{"from":"stigmatised", "to":"stigmatized"},
{"from":"stigmatises", "to":"stigmatizes"},
{"from":"stigmatising", "to":"stigmatizing"},
{"from":"storey", "to":"story"},
{"from":"storeys", "to":"stories"},
{"from":"subsidisation", "to":"subsidization"},
{"from":"subsidise", "to":"subsidize"},
{"from":"subsidised", "to":"subsidized"},
{"from":"subsidiser", "to":"subsidizer"},
{"from":"subsidisers", "to":"subsidizers"},
{"from":"subsidises", "to":"subsidizes"},
{"from":"subsidising", "to":"subsidizing"},
{"from":"succour", "to":"succor"},
{"from":"succoured", "to":"succored"},
{"from":"succouring", "to":"succoring"},
{"from":"succours", "to":"succors"},
{"from":"sulphate", "to":"sulfate"},
{"from":"sulphates", "to":"sulfates"},
{"from":"sulphide", "to":"sulfide"},
{"from":"sulphides", "to":"sulfides"},
{"from":"sulphur", "to":"sulfur"},
{"from":"sulphurous", "to":"sulfurous"},
{"from":"summarise", "to":"summarize"},
{"from":"summarised", "to":"summarized"},
{"from":"summarises", "to":"summarizes"},
{"from":"summarising", "to":"summarizing"},
{"from":"swivelled", "to":"swiveled"},
{"from":"swivelling", "to":"swiveling"},
{"from":"symbolise", "to":"symbolize"},
{"from":"symbolised", "to":"symbolized"},
{"from":"symbolises", "to":"symbolizes"},
{"from":"symbolising", "to":"symbolizing"},
{"from":"sympathise", "to":"sympathize"},
{"from":"sympathised", "to":"sympathized"},
{"from":"sympathiser", "to":"sympathizer"},
{"from":"sympathisers", "to":"sympathizers"},
{"from":"sympathises", "to":"sympathizes"},
{"from":"sympathising", "to":"sympathizing"},
{"from":"synchronisation", "to":"synchronization"},
{"from":"synchronise", "to":"synchronize"},
{"from":"synchronised", "to":"synchronized"},
{"from":"synchronises", "to":"synchronizes"},
{"from":"synchronising", "to":"synchronizing"},
{"from":"synthesise", "to":"synthesize"},
{"from":"synthesised", "to":"synthesized"},
{"from":"synthesiser", "to":"synthesizer"},
{"from":"synthesisers", "to":"synthesizers"},
{"from":"synthesises", "to":"synthesizes"},
{"from":"synthesising", "to":"synthesizing"},
{"from":"syphon", "to":"siphon"},
{"from":"syphoned", "to":"siphoned"},
{"from":"syphoning", "to":"siphoning"},
{"from":"syphons", "to":"siphons"},
{"from":"systematisation", "to":"systematization"},
{"from":"systematise", "to":"systematize"},
{"from":"systematised", "to":"systematized"},
{"from":"systematises", "to":"systematizes"},
{"from":"systematising", "to":"systematizing"},
{"from":"tantalise", "to":"tantalize"},
{"from":"tantalised", "to":"tantalized"},
{"from":"tantalises", "to":"tantalizes"},
{"from":"tantalising", "to":"tantalizing"},
{"from":"tantalisingly", "to":"tantalizingly"},
{"from":"tasselled", "to":"tasseled"},
{"from":"technicolour", "to":"technicolor"},
{"from":"temporise", "to":"temporize"},
{"from":"temporised", "to":"temporized"},
{"from":"temporises", "to":"temporizes"},
{"from":"temporising", "to":"temporizing"},
{"from":"tenderise", "to":"tenderize"},
{"from":"tenderised", "to":"tenderized"},
{"from":"tenderises", "to":"tenderizes"},
{"from":"tenderising", "to":"tenderizing"},
{"from":"terrorise", "to":"terrorize"},
{"from":"terrorised", "to":"terrorized"},
{"from":"terrorises", "to":"terrorizes"},
{"from":"terrorising", "to":"terrorizing"},
{"from":"theatre", "to":"theater"},
{"from":"theatregoer", "to":"theatergoer"},
{"from":"theatregoers", "to":"theatergoers"},
{"from":"theatres", "to":"theaters"},
{"from":"theorise", "to":"theorize"},
{"from":"theorised", "to":"theorized"},
{"from":"theorises", "to":"theorizes"},
{"from":"theorising", "to":"theorizing"},
{"from":"tonne", "to":"ton"},
{"from":"tonnes", "to":"tons"},
{"from":"towelled", "to":"toweled"},
{"from":"towelling", "to":"toweling"},
{"from":"toxaemia", "to":"toxemia"},
{"from":"tranquillise", "to":"tranquilize"},
{"from":"tranquillised", "to":"tranquilized"},
{"from":"tranquilliser", "to":"tranquilizer"},
{"from":"tranquillisers", "to":"tranquilizers"},
{"from":"tranquillises", "to":"tranquilizes"},
{"from":"tranquillising", "to":"tranquilizing"},
{"from":"tranquillity", "to":"tranquility"},
{"from":"tranquillize", "to":"tranquilize"},
{"from":"tranquillized", "to":"tranquilized"},
{"from":"tranquillizer", "to":"tranquilizer"},
{"from":"tranquillizers", "to":"tranquilizers"},
{"from":"tranquillizes", "to":"tranquilizes"},
{"from":"tranquillizing", "to":"tranquilizing"},
{"from":"tranquilly", "to":"tranquility"},
{"from":"transistorised", "to":"transistorized"},
{"from":"traumatise", "to":"traumatize"},
{"from":"traumatised", "to":"traumatized"},
{"from":"traumatises", "to":"traumatizes"},
{"from":"traumatising", "to":"traumatizing"},
{"from":"travelled", "to":"traveled"},
{"from":"traveller", "to":"traveler"},
{"from":"travellers", "to":"travelers"},
{"from":"travelling", "to":"traveling"},
{"from":"travelogue", "to":"travelog"},
{"from":"travelogues", "to":"travelogs"},
{"from":"trialled", "to":"trialed"},
{"from":"trialling", "to":"trialing"},
{"from":"tricolour", "to":"tricolor"},
{"from":"tricolours", "to":"tricolors"},
{"from":"trivialise", "to":"trivialize"},
{"from":"trivialised", "to":"trivialized"},
{"from":"trivialises", "to":"trivializes"},
{"from":"trivialising", "to":"trivializing"},
{"from":"tumour", "to":"tumor"},
{"from":"tumours", "to":"tumors"},
{"from":"tunnelled", "to":"tunneled"},
{"from":"tunnelling", "to":"tunneling"},
{"from":"tyrannise", "to":"tyrannize"},
{"from":"tyrannised", "to":"tyrannized"},
{"from":"tyrannises", "to":"tyrannizes"},
{"from":"tyrannising", "to":"tyrannizing"},
{"from":"tyre", "to":"tire"},
{"from":"tyres", "to":"tires"},
{"from":"unauthorised", "to":"unauthorized"},
{"from":"uncivilised", "to":"uncivilized"},
{"from":"underutilised", "to":"underutilized"},
{"from":"unequalled", "to":"unequaled"},
{"from":"unfavourable", "to":"unfavorable"},
{"from":"unfavourably", "to":"unfavorably"},
{"from":"unionisation", "to":"unionization"},
{"from":"unionise", "to":"unionize"},
{"from":"unionised", "to":"unionized"},
{"from":"unionises", "to":"unionizes"},
{"from":"unionising", "to":"unionizing"},
{"from":"unorganised", "to":"unorganized"},
{"from":"unravelled", "to":"unraveled"},
{"from":"unravelling", "to":"unraveling"},
{"from":"unrecognisable", "to":"unrecognizable"},
{"from":"unrecognised", "to":"unrecognized"},
{"from":"unrivalled", "to":"unrivaled"},
{"from":"unsavoury", "to":"unsavory"},
{"from":"untrammelled", "to":"untrammeled"},
{"from":"urbanisation", "to":"urbanization"},
{"from":"urbanise", "to":"urbanize"},
{"from":"urbanised", "to":"urbanized"},
{"from":"urbanises", "to":"urbanizes"},
{"from":"urbanising", "to":"urbanizing"},
{"from":"utilisable", "to":"utilizable"},
{"from":"utilisation", "to":"utilization"},
{"from":"utilise", "to":"utilize"},
{"from":"utilised", "to":"utilized"},
{"from":"utilises", "to":"utilizes"},
{"from":"utilising", "to":"utilizing"},
{"from":"valour", "to":"valor"},
{"from":"vandalise", "to":"vandalize"},
{"from":"vandalised", "to":"vandalized"},
{"from":"vandalises", "to":"vandalizes"},
{"from":"vandalising", "to":"vandalizing"},
{"from":"vaporisation", "to":"vaporization"},
{"from":"vaporise", "to":"vaporize"},
{"from":"vaporised", "to":"vaporized"},
{"from":"vaporises", "to":"vaporizes"},
{"from":"vaporising", "to":"vaporizing"},
{"from":"vapour", "to":"vapor"},
{"from":"vapours", "to":"vapors"},
{"from":"verbalise", "to":"verbalize"},
{"from":"verbalised", "to":"verbalized"},
{"from":"verbalises", "to":"verbalizes"},
{"from":"verbalising", "to":"verbalizing"},
{"from":"victimisation", "to":"victimization"},
{"from":"victimise", "to":"victimize"},
{"from":"victimised", "to":"victimized"},
{"from":"victimises", "to":"victimizes"},
{"from":"victimising", "to":"victimizing"},
{"from":"videodisc", "to":"videodisk"},
{"from":"videodiscs", "to":"videodisks"},
{"from":"vigour", "to":"vigor"},
{"from":"visualisation", "to":"visualization"},
{"from":"visualisations", "to":"visualizations"},
{"from":"visualise", "to":"visualize"},
{"from":"visualised", "to":"visualized"},
{"from":"visualises", "to":"visualizes"},
{"from":"visualising", "to":"visualizing"},
{"from":"vocalisation", "to":"vocalization"},
{"from":"vocalisations", "to":"vocalizations"},
{"from":"vocalise", "to":"vocalize"},
{"from":"vocalised", "to":"vocalized"},
{"from":"vocalises", "to":"vocalizes"},
{"from":"vocalising", "to":"vocalizing"},
{"from":"vulcanised", "to":"vulcanized"},
{"from":"vulgarisation", "to":"vulgarization"},
{"from":"vulgarise", "to":"vulgarize"},
{"from":"vulgarised", "to":"vulgarized"},
{"from":"vulgarises", "to":"vulgarizes"},
{"from":"vulgarising", "to":"vulgarizing"},
{"from":"waggon", "to":"wagon"},
{"from":"waggons", "to":"wagons"},
{"from":"watercolour", "to":"watercolor"},
{"from":"watercolours", "to":"watercolors"},
{"from":"weaselled", "to":"weaseled"},
{"from":"weaselling", "to":"weaseling"},
{"from":"westernisation", "to":"westernization"},
{"from":"westernise", "to":"westernize"},
{"from":"westernised", "to":"westernized"},
{"from":"westernises", "to":"westernizes"},
{"from":"westernising", "to":"westernizing"},
{"from":"womanise", "to":"womanize"},
{"from":"womanised", "to":"womanized"},
{"from":"womaniser", "to":"womanizer"},
{"from":"womanisers", "to":"womanizers"},
{"from":"womanises", "to":"womanizes"},
{"from":"womanising", "to":"womanizing"},
{"from":"woollen", "to":"woolen"},
{"from":"woollens", "to":"woolens"},
{"from":"woollies", "to":"woolies"},
{"from":"woolly", "to":"wooly"},
{"from":"worshipped", "to":"worshiped"},
{"from":"worshipping", "to":"worshiping"},
{"from":"worshipper", "to":"worshiper"},
{"from":"yodelled", "to":"yodeled"},
{"from":"yodelling", "to":"yodeling"},
{"from":"yoghourt", "to":"yogurt"},
{"from":"yoghourts", "to":"yogurts"},
{"from":"yoghurt", "to":"yogurt"},
{"from":"yoghurts", "to":"yogurts"}
],
"typos":[
{"misspelling":"accomodation","correct":"accommodation"},
{"misspelling":"acommodation","correct":"accommodation"},
{"misspelling":"acheive","correct":"achieve"},
{"misspelling":"accross","correct":"across"},
{"misspelling":"adress","correct":"address"},
{"misspelling":"agressive","correct":"aggressive"},
{"misspelling":"alot","correct":"a lot"},
{"misspelling":"apparantly","correct":"apparently"},
{"misspelling":"appearence","correct":"appearance"},
{"misspelling":"arguement","correct":"argument"},
{"misspelling":"assasination","correct":"assassination"},
{"misspelling":"basicly","correct":"basically"},
{"misspelling":"beggining","correct":"beginning"},
{"misspelling":"beleive","correct":"believe"},
{"misspelling":"bizzare","correct":"bizarre"},
{"misspelling":"buisness","correct":"business"},
{"misspelling":"carribean","correct":"caribbean"},
{"misspelling":"chauffer","correct":"chauffeur"},
{"misspelling":"cemetary","correct":"cemetery"},
{"misspelling":"collegue","correct":"colleague"},
{"misspelling":"commitee","correct":"committee"},
{"misspelling":"committment","correct":"commitment"},
{"misspelling":"completly","correct":"completely"},
{"misspelling":"concious","correct":"conscious"},
{"misspelling":"copywrite","correct":"copyright"},
{"misspelling":"curiousity","correct":"curiosity"},
{"misspelling":"decaffinated","correct":"decaffeinated"},
{"misspelling":"definately","correct":"definitely"},
{"misspelling":"dependance","correct":"dependence"},
{"misspelling":"desireable","correct":"desirable"},
{"misspelling":"diarhea","correct":"diarrhoea"},
{"misspelling":"dissapoint","correct":"disappoint"},
{"misspelling":"dissapear","correct":"disappear"},
{"misspelling":"dispell","correct":"dispel"},
{"misspelling":"ecstacy","correct":"ecstasy"},
{"misspelling":"embarass","correct":"embarrass"},
{"misspelling":"enviroment","correct":"environment"},
{"misspelling":"Farenheit","correct":"Fahrenheit"},
{"misspelling":"febuary","correct":"february"},
{"misspelling":"finaly","correct":"finally"},
{"misspelling":"fluoroscent","correct":"fluorescent"},
{"misspelling":"flouride","correct":"fluoride"},
{"misspelling":"foriegn","correct":"foreign"},
{"misspelling":"forteen","correct":"fourteen"},
{"misspelling":"fourty","correct":"forty"},
{"misspelling":"freind","correct":"friend"},
{"misspelling":"geneology","correct":"genealogy"},
{"misspelling":"glamourous","correct":"glamorous"},
{"misspelling":"goverment","correct":"government"},
{"misspelling":"grammer","correct":"grammar"},
{"misspelling":"happend","correct":"happened"},
{"misspelling":"hemorage","correct":"haemorrhage"},
{"misspelling":"heros","correct":"heroes"},
{"misspelling":"hight","correct":"height"},
{"misspelling":"humourous","correct":"humorous"},
{"misspelling":"hygeine","correct":"hygiene"},
{"misspelling":"idiosyncracy","correct":"idiosyncrasy"},
{"misspelling":"independance","correct":"independence"},
{"misspelling":"interupt","correct":"interrupt"},
{"misspelling":"intresting","correct":"interesting"},
{"misspelling":"juge","correct":"judge"},
{"misspelling":"knowlege","correct":"knowledge"},
{"misspelling":"lazer","correct":"laser"},
{"misspelling":"liason","correct":"liaison"},
{"misspelling":"libary","correct":"library"},
{"misspelling":"lightening","correct":"lightning"},
{"misspelling":"lollypop","correct":"lollipop"},
{"misspelling":"millenium","correct":"millennium"},
{"misspelling":"mischievious","correct":"mischievous"},
{"misspelling":"mispell","correct":"misspell"},
{"misspelling":"monkies","correct":"monkeys"},
{"misspelling":"morgage","correct":"mortgage"},
{"misspelling":"neccessary","correct":"necessary"},
{"misspelling":"neice","correct":"niece"},
{"misspelling":"noone","correct":"no one"},
{"misspelling":"noticable","correct":"noticeable"},
{"misspelling":"occassion","correct":"occasion"},
{"misspelling":"occured","correct":"occurred"},
{"misspelling":"oppurtunity","correct":"opportunity"},
{"misspelling":"paralell","correct":"parallel"},
{"misspelling":"pasttime","correct":"pastime"},
{"misspelling":"peice","correct":"piece"},
{"misspelling":"persistant","correct":"persistent"},
{"misspelling":"persue","correct":"pursue"},
{"misspelling":"pharoah","correct":"pharaoh"},
{"misspelling":"portugese","correct":"portuguese"},
{"misspelling":"posession","correct":"possession"},
{"misspelling":"potatoe","correct":"potato"},
{"misspelling":"preceeding","correct":"preceding"},
{"misspelling":"prefered","correct":"preferred"},
{"misspelling":"pronounciation","correct":"pronunciation"},
{"misspelling":"propoganda","correct":"propaganda"},
{"misspelling":"privelige","correct":"privilege"},
{"misspelling":"publically","correct":"publicly"},
{"misspelling":"rasberry","correct":"raspberry"},
{"misspelling":"recieve","correct":"receive"},
{"misspelling":"reccomend","correct":"recommend"},
{"misspelling":"rythm","correct":"rhythm"},
{"misspelling":"shedule","correct":"schedule"},
{"misspelling":"seige","correct":"siege"},
{"misspelling":"sentance","correct":"sentence"},
{"misspelling":"seperate","correct":"separate"},
{"misspelling":"sieze","correct":"seize"},
{"misspelling":"sincerly","correct":"sincerely"},
{"misspelling":"supercede","correct":"supersede"},
{"misspelling":"suprise","correct":"surprise"},
{"misspelling":"tatoo","correct":"tattoo"},
{"misspelling":"tendancy","correct":"tendency"},
{"misspelling":"thier","correct":"their"},
{"misspelling":"threshhold","correct":"threshold"},
{"misspelling":"tommorrow","correct":"tomorrow"},
{"misspelling":"truely","correct":"truly"},
{"misspelling":"untill","correct":"until"},
{"misspelling":"vaccuum","correct":"vacuum"},
{"misspelling":"vegeterian","correct":"vegetarian"},
{"misspelling":"wendesday","correct":"wednesday"},
{"misspelling":"whereever","correct":"wherever"},
{"misspelling":"wierd","correct":"weird"},
{"misspelling":"writen","correct":"written"}
]
}
}
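As a rough illustration of how such a rule file could be consumed, the sketch below loads a trimmed-down rule set and applies it to a list of tokens. The function names (build_lookup, normalize_tokens) and the order of application (typos first, then word mapping) are assumptions made for this example; they are not taken from the tokenizer's actual implementation.

```python
import json

# Hypothetical, trimmed-down rule file using the same field names as above.
RULES_JSON = """
{
  "normalization": {
    "word-mapping": [
      {"from": "serialise", "to": "serialize"},
      {"from": "theatre", "to": "theater"}
    ],
    "typos": [
      {"misspelling": "recieve", "correct": "receive"},
      {"misspelling": "alot", "correct": "a lot"}
    ]
  }
}
"""


def build_lookup(rules):
    """Turn the two rule lists into dictionaries for constant-time lookup."""
    normalization = rules.get("normalization", {})
    typos = {e["misspelling"]: e["correct"] for e in normalization.get("typos", [])}
    mapping = {e["from"]: e["to"] for e in normalization.get("word-mapping", [])}
    return typos, mapping


def normalize_tokens(tokens, typos, mapping):
    """Fix typos first, then map words (assumed order, for illustration only)."""
    normalized = []
    for token in tokens:
        fixed = typos.get(token, token)
        # A correction may expand to several words, e.g. "alot" -> "a lot".
        for word in fixed.split():
            normalized.append(mapping.get(word, word))
    return normalized


typos, mapping = build_lookup(json.loads(RULES_JSON))
normalized = normalize_tokens(["recieve", "alot", "theatre", "tickets"], typos, mapping)
print(normalized)
```

Dictionary lookups keep the cost per token constant, which matters because the real word-mapping list contains well over a thousand entries.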
Configure tokenizer logger
The logger is configured at the top level of the JSON, in the logger field.
Example of Configuration:
{
"logger": {
"logging-level": "{{ project.loglevel }}"
}
}
The logger field is:
- logging-level
It can be set to the following values:
- debug for the debug level and developer information
- info for the information level
- warning to display only warnings and errors
- error to display only errors
- critical to display only critical errors
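The level names above line up with the standard levels of Python's logging module. The following is a minimal sketch of how a logging-level value could be applied, assuming the configuration has already been parsed into a dictionary; configure_logger and the logger name are hypothetical and are not part of the tokenizer code.

```python
import logging

# Assumed mapping from the documented level names to Python's logging constants.
LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warning": logging.WARNING,
    "error": logging.ERROR,
    "critical": logging.CRITICAL,
}


def configure_logger(config, name="tokenizer-example"):
    """Read logger/logging-level from a parsed configuration and apply it."""
    level_name = config.get("logger", {}).get("logging-level", "info").lower()
    if level_name not in LEVELS:
        raise ValueError(f"unknown logging-level: {level_name!r}")
    logger = logging.getLogger(name)
    logger.setLevel(LEVELS[level_name])
    return logger


logger = configure_logger({"logger": {"logging-level": "warning"}})
logger.info("filtered out at warning level")
logger.warning("emitted at warning level")
```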
Configure tokenizer Network
Example of Configuration:
{
"network": {
"host":"0.0.0.0",
"port":8080,
"associate-environment": {
"host":"HOST_ENVNAME",
"port":"PORT_ENVNAME"
},
"ssl":
{
"certificate":"path/to/certificate",
"key":"path/to/key"
}
}
}
The network fields:
- host : hostname
- port : port of the service
- associate-environment : names of the environment variables that, when set, replace the default values above. This field is not mandatory.
  - "host" : associated "host" environment variable
  - "port" : associated "port" environment variable
- ssl : SSL configuration. IN PRODUCTION IT IS MANDATORY TO USE A CERTIFICATE AND KEY THAT ARE *NOT* SELF SIGNED
  - certificate : certificate file
  - key : key file
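The override behaviour of associate-environment can be sketched as follows. This is an illustration under assumptions: resolve_network is a hypothetical helper, and the rule that an environment variable takes precedence whenever it is set is inferred from the description above rather than from the service code.

```python
import os


def resolve_network(network, environ=None):
    """Return (host, port), preferring the associated environment variables."""
    environ = os.environ if environ is None else environ
    env_names = network.get("associate-environment", {})

    def pick(field):
        # Use the environment variable only if one is named and actually set.
        env_name = env_names.get(field)
        if env_name and env_name in environ:
            return environ[env_name]
        return network[field]

    return str(pick("host")), int(pick("port"))


network = {
    "host": "0.0.0.0",
    "port": 8080,
    "associate-environment": {"host": "HOST_ENVNAME", "port": "PORT_ENVNAME"},
}
print(resolve_network(network, {}))
print(resolve_network(network, {"PORT_ENVNAME": "9443"}))
```

Passing the environment as a plain dictionary keeps the helper easy to test; omitting the argument falls back to os.environ.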
Configure tokenizer runtime
Example of Configuration:
{
"runtime":{
"request-max-size":100000000,
"request-buffer-queue-size":100,
"keep-alive":true,
"keep-alive-timeout":5,
"graceful-shutown-timeout":15.0,
"request-timeout":60,
"response-timeout":60,
"workers":1
}
}
The runtime fields:
- request-max-size : how big a request may be (bytes)
- request-buffer-queue-size : request streaming buffer queue size
- request-timeout : how long a request can take to arrive (sec)
- response-timeout : how long a response can take to process (sec)
- keep-alive : whether keep-alive is enabled (true or false)
- keep-alive-timeout : how long to hold a TCP connection open (sec)
- graceful-shutown-timeout : how long to wait before force-closing non-idle connections (sec)
- workers : number of workers for the service on a node
- associate-environment : for each of the previous fields, the name of an environment variable that, when set, replaces the default value. This field is not mandatory.
  - request-max-size : overwritten by the environment variable
  - request-buffer-queue-size : overwritten by the environment variable
  - request-timeout : overwritten by the environment variable
  - response-timeout : overwritten by the environment variable
  - keep-alive : overwritten by the environment variable
  - keep-alive-timeout : overwritten by the environment variable
  - graceful-shutown-timeout : overwritten by the environment variable
  - workers : overwritten by the environment variable
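Because environment variables are always strings, runtime overrides need to be cast back to the type of the default value. The sketch below shows one way to do that. The helper names, the "associate-environment" key name for the runtime block, the example variable names, and the accepted boolean spellings are all assumptions made for illustration.

```python
def cast_like(default, raw):
    """Cast an environment string to the type of the configured default."""
    if isinstance(default, bool):  # checked before int: bool is a subclass of int
        return raw.strip().lower() in ("1", "true", "yes", "on")
    if isinstance(default, int):
        return int(raw)
    if isinstance(default, float):
        return float(raw)
    return raw


def resolve_runtime(runtime, environ):
    """Apply environment overrides to every runtime field that names one."""
    env_names = runtime.get("associate-environment", {})
    resolved = {}
    for field, default in runtime.items():
        if field == "associate-environment":
            continue
        env_name = env_names.get(field)
        if env_name and env_name in environ:
            resolved[field] = cast_like(default, environ[env_name])
        else:
            resolved[field] = default
    return resolved


runtime = {
    "keep-alive": True,
    "request-timeout": 60,
    "graceful-shutown-timeout": 15.0,
    "workers": 1,
    # Hypothetical environment variable names.
    "associate-environment": {
        "keep-alive": "TOKENIZER_KEEP_ALIVE",
        "workers": "TOKENIZER_WORKERS",
    },
}
resolved = resolve_runtime(
    runtime, {"TOKENIZER_WORKERS": "4", "TOKENIZER_KEEP_ALIVE": "false"}
)
print(resolved)
```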
Tokenizer service
To create the tokenizer resources, simply run:
python3 thot/tasks/tokenizer/createAnnotationResource.py --entries=/home/tkeir_svc/tkeir/configs/default/configs/annotation-resources.json --output=/home/tkeir_svc/tkeir/configs/default/resources/modeling/tokenizer/en/tkeir_mwe.pkl
To run the service, type the following from the tkeir directory:
or, if you installed the tkeir wheel:
A lightweight client can be run with the command:
python3 thot/tokenizer_client.py --config=<path to tokenizer configuration file> --input=<input directory> --output=<output directory>
or, if you installed the tkeir wheel:
python3 tkeir-tokenizer-client.py --config=<path to tokenizer configuration file> --input=<input directory> --output=<output directory>
Tokenizer Tests
The tokenizer service comes with unit and functional tests.
Tokenizer Unit tests
The unit tests cover the Tokenizer classes only.
python3 -m unittest thot/tests/unittests/TestTokenizerConfiguration.py
python3 -m unittest thot/tests/unittests/TestTokenizer.py
Notes:
- An error caused by the file tkeir_mwe.pkl is expected when the resources have not been created yet. You can avoid it by creating the resources model.
- The model data directory is mapped in the docker-compose file; please check that all the configuration files are inside this directory.