Re: How do I remove Nikkud (vowel marks) from a Word 2016 document? I am working on a commentary on Ethics of the Fathers and I want readers to be able to read sources I'm quoting in their
A quick Google search for "hebrew remove nikkud" gave an answer.
On GitHub there's a JavaScript implementation with a live preview. If you only have a little text, you can use the JavaScript either online or download it and run it on your PC (save as .js).
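The Hebrew cantillation marks and nikkud all have char codes between 1425 and 1479 in decimal, which is 0591 to 05C7 in hexadecimal (U+0591 to U+05C7). The Hebrew letters themselves start at U+05D0.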
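As a rough sketch of what that range means in practice, the snippet below removes only the code points from U+0591 to U+05C7 that Unicode classifies as nonspacing marks, so punctuation inside that range, such as the maqaf (U+05BE), is kept. The function name strip_hebrew_marks is made up for this illustration; this is not the GitHub script mentioned above.

```python
import unicodedata

# Range taken from the post: U+0591..U+05C7 (decimal 1425..1479).
HEBREW_MARKS_START = 0x0591
HEBREW_MARKS_END = 0x05C7

def strip_hebrew_marks(text):
    """Remove Hebrew cantillation marks and nikkud, keep everything else.

    Hypothetical helper for illustration only.
    """
    return "".join(
        ch for ch in text
        # Drop a character only if it is in the range AND is a nonspacing mark.
        if not (HEBREW_MARKS_START <= ord(ch) <= HEBREW_MARKS_END
                and unicodedata.category(ch) == "Mn")
    )

# "shalom" written with nikkud, spelled out as escapes to avoid editor issues.
pointed = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
plain = strip_hebrew_marks(pointed)
```

Unlike the approach that strips every combining mark, this leaves accents in other scripts untouched.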
Python implementation (tested):
import unicodedata

# nikkud-test.txt is the file you save your text in.
with open('nikkud-test.txt', 'r', encoding='utf-8') as f:
    content = f.read()

# NFKD decomposes precomposed characters into a base character plus combining marks.
normalized = unicodedata.normalize('NFKD', content)
# Drop every combining mark; this is what removes the nikkud.
no_nikkud = ''.join(c for c in normalized if not unicodedata.combining(c))

with open('no-nikkud-test.txt', 'w', encoding='utf-8') as f:
    f.write(no_nikkud)
This works very fast.
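One caveat worth knowing: the script drops every combining mark it finds after NFKD normalization, not only the Hebrew ones. The sketch below applies the same technique to strings in memory to show the effect; strip_all_combining is a name invented for this example.

```python
import unicodedata

def strip_all_combining(text):
    """Same technique as the script above, applied to a string in memory."""
    normalized = unicodedata.normalize('NFKD', text)
    return ''.join(c for c in normalized if not unicodedata.combining(c))

# "shalom" with nikkud: only the four letters remain afterwards.
hebrew_result = strip_all_combining("\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd")

# A Latin word with an accent: the accent on the final "e" is removed as well.
latin_result = strip_all_combining("caf\u00e9")
```

If your document mixes Hebrew with accented text in other languages, you may prefer to remove only the marks in the Hebrew range.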
UPDATED:
How to use this script:
Download Python 3 from python.org.
Save your nikkud text (UTF-8 encoded) to nikkud-test.txt in any directory.
Open a command prompt (cmd) or terminal from the Start menu.
Move to the directory where you saved the file by typing cd followed by the directory path.
Type python, or open an IPython console.
Copy and paste the script.
no-nikkud-test.txt will show up in the same directory.
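If you would rather not paste code into the interpreter, the same steps could be saved as a small script and run with the file names as arguments. This is only a sketch: the file name remove_nikkud.py and the functions remove_nikkud_file and main are assumptions for the example, not part of the original answer.

```python
import os
import sys
import unicodedata

def remove_nikkud_file(src, dst):
    """Read src as UTF-8, strip combining marks, write the result to dst."""
    with open(src, 'r', encoding='utf-8') as f:
        content = f.read()
    normalized = unicodedata.normalize('NFKD', content)
    no_nikkud = ''.join(c for c in normalized if not unicodedata.combining(c))
    with open(dst, 'w', encoding='utf-8') as f:
        f.write(no_nikkud)

def main(argv):
    # Expect exactly two arguments: an existing input file and an output path.
    if len(argv) != 2 or not os.path.isfile(argv[0]):
        print("usage: python remove_nikkud.py INPUT.txt OUTPUT.txt")
        return 1
    remove_nikkud_file(argv[0], argv[1])
    return 0

if __name__ == "__main__":
    main(sys.argv[1:])
```

From the directory containing your text you would then type: python remove_nikkud.py nikkud-test.txt no-nikkud-test.txt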
UPDATE without Terminal (tested with Python 3.5 IDLE and IPython)
Download Python 3.5 or higher from python.org.
Save your niqqud text (UTF-8 encoded) to niqqud.txt in your Documents folder. (Windows / Mac)
Open IDLE from the Start menu. (Alternatively, use IPython.)
Copy and paste the function below:
def hasar_niqqud(source="niqqud.txt"):
    """Remove niqqud vowel diacritics from Hebrew text.

    @param source: The source filename with .txt extension.
    """
    import os, unicodedata
    path = os.path.expanduser('~/Documents/' + str(source))
    with open(path, 'r', encoding='utf-8') as f:
        content = f.read()
    normalized = unicodedata.normalize('NFKD', content)
    no_niqqud = ''.join(c for c in normalized if not unicodedata.combining(c))
    # Strip the ".txt" extension and write the result next to the source file.
    path = os.path.expanduser('~/Documents/' + str(source)[:-4] + "-removed.txt")
    with open(path, 'w', encoding='utf-8') as f:
        f.write(no_niqqud)
Then run the function with this code:
hasar_niqqud()
That's it! You can find the output file, niqqud-removed.txt, in the Documents folder.
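Note that the function builds the output name by cutting off the last four characters, so it assumes a .txt extension. If your file has a different extension, a helper along these lines could build the name instead; removed_path is a hypothetical name used only in this sketch.

```python
from pathlib import Path

def removed_path(source):
    """Return the sibling path with "-removed" inserted before the extension."""
    p = Path(source)
    # p.stem is the name without its extension, p.suffix is the extension itself.
    return p.with_name(p.stem + "-removed" + p.suffix)

output = removed_path("niqqud.txt")
```

Here output.name is "niqqud-removed.txt", matching what hasar_niqqud() produces for the default file name.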