This paper proposes a hybrid stemming algorithm for Persian (Farsi). The algorithm combines dictionary look-up with affix removal.
This paper explains rootkits and their types.
This paper analyzes lexical bundles in the academic literature of computational linguistics. The study focuses mainly on EAP (English for Academic Purposes).
Adel Rahimi, (2015) "Critical Discourse Analysis of Adolf Hitler’s Speeches". Unpublished Manuscript, IKIU. DOI: 10.6084/m9.figshare.1491387 (Data).
This paper is a critical discourse analysis of Hitler's speeches. The study is based on a corpus of all his speeches.
NLTK, Bokeh, Matplotlib, SciPy, Flask, Selenium, Requests, Beautiful Soup, Scrapy, Django, scikit-learn
NLP, Text mining (TM)
HTML-CSS (Responsive Designs)
Elasticsearch, MongoDB (PyMongo) and other NoSQL databases, MySQL and similar SQL databases
Beginner-level familiarity with the Hadoop framework, Spark, and the MapReduce algorithm
Chart.js, Power BI, Tableau, RDF, OWL
Familiar with software quality assurance and testing, CMMI, SDLC, Agile methods, YouTrack (for issue tracking and project management), Jira
Git (GitHub, GitLab, and Bitbucket), familiar with machine learning approaches, RapidMiner, SPSS (data mining), Weka, Orange, Apache OpenNLP, object-oriented design and MVC design, Linux and Mac (primary), WordPress, familiar with cloud tools such as Heroku and Amazon S3
This website uses AI to turn your text into presentations.
Altervocab is an app that converts informal writing into formal writing using n-grams.
Simple tools for converting the parameters of the RFC (Rise Fall Connection) intonational model to those of the Tilt model and vice versa. (Github Page) (Released under GPLv3).
A new update is coming soon.
Adel Rahimi. (2015). RFC-Tilt v1.2. Zenodo. DOI: 10.5281/zenodo.29616
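The RFC-to-Tilt conversion mentioned above can be illustrated with a minimal Python sketch. It assumes the formulation commonly given for the Tilt model, where an event's rise and fall amplitudes and durations are collapsed into a total amplitude, a total duration, and a single tilt value. The class and function names are hypothetical and are not taken from the RFC-Tilt tool itself, and amplitudes are treated as magnitudes for simplicity.

```python
from dataclasses import dataclass


@dataclass
class RFCEvent:
    """Hypothetical container: rise/fall magnitudes (Hz) and durations (s)."""
    rise_amp: float
    rise_dur: float
    fall_amp: float
    fall_dur: float


@dataclass
class TiltEvent:
    """Hypothetical container: total amplitude, total duration, tilt in [-1, 1]."""
    amp: float
    dur: float
    tilt: float


def rfc_to_tilt(event: RFCEvent) -> TiltEvent:
    a_rise, a_fall = abs(event.rise_amp), abs(event.fall_amp)
    amp = a_rise + a_fall
    dur = event.rise_dur + event.fall_dur
    # Guard against empty events to avoid division by zero.
    tilt_amp = (a_rise - a_fall) / amp if amp else 0.0
    tilt_dur = (event.rise_dur - event.fall_dur) / dur if dur else 0.0
    # The single tilt value is the mean of the amplitude and duration tilts.
    return TiltEvent(amp=amp, dur=dur, tilt=(tilt_amp + tilt_dur) / 2)


def tilt_to_rfc(event: TiltEvent) -> RFCEvent:
    # Share of the event assigned to the rise; the remainder goes to the fall.
    rise_share = (1 + event.tilt) / 2
    return RFCEvent(
        rise_amp=event.amp * rise_share,
        rise_dur=event.dur * rise_share,
        fall_amp=event.amp * (1 - rise_share),
        fall_dur=event.dur * (1 - rise_share),
    )
```

Because amplitude tilt and duration tilt are averaged into one number, the reverse direction only recovers the original RFC parameters exactly when the two tilts were equal.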
The first stemmer for Kurmanji Kurdish. (Github Page) (Released under GPLv3)
Rule-based stemmer written in Python that covers most Kurdish suffixes, including: 'ek', 'van', 'dar', 'kar', 'xane', 'stan', 'geh', 'én', 'an', 'yan', 'mend', 'em', 'émin', 'in', 'tir'.
The second release has just come out, with more suffixes and typo fixes.
To cite any version of the stemmer, visit the Zenodo page for the Kurmanji Stemmer.
Adel Rahimi. (2015). Kurmanji-Stemmer: The second release. Zenodo. DOI: 10.5281/zenodo.29605
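As a rough illustration of what rule-based suffix stripping looks like, here is a minimal Python sketch built on the suffix list quoted above. The longest-match ordering, the lowercasing, and the `min_stem_len` guard are assumptions made for this example and may not match how the released stemmer behaves.

```python
# Suffixes as listed in the stemmer description above.
SUFFIXES = ['ek', 'van', 'dar', 'kar', 'xane', 'stan', 'geh', 'én',
            'an', 'yan', 'mend', 'em', 'émin', 'in', 'tir']

# Assumption: try longer suffixes first so 'xane' wins over shorter endings.
_BY_LENGTH = sorted(SUFFIXES, key=len, reverse=True)


def stem(word: str, min_stem_len: int = 2) -> str:
    """Strip at most one known suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in _BY_LENGTH:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[:-len(suffix)]
    return word


print(stem("kitêbxane"))  # kitêb
print(stem("mezintir"))   # mezin
```

The minimum stem length is a common safeguard in suffix strippers: it keeps very short words that merely look like a suffix from being reduced to nothing.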
Simple Persian to Kurmanji transliteration. (Github page)
This Praat script gives the details of pitch (maximum, minimum) for labeled tiers in several audio files in bulk. (Github page)
This Praat script converts pitch values between units.
Units supported: Hertz, Semitone, Bark, Mel. (Github page)
This script was not originally written by me, but I took the liberty of correcting and commenting it.
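For readers who want to see what such a unit conversion involves, here is a minimal Python sketch that routes every conversion through Hertz. The formulas are the ones I understand Praat's built-in `hertzToSemitones`, `hertzToMel`, and `hertzToBark` functions to use (semitones relative to 100 Hz); that is an assumption, and the Praat script itself may define the units differently. The function and dictionary names are hypothetical.

```python
import math

# Unit -> Hertz. Assumed reference: semitones are relative to 100 Hz.
_TO_HZ = {
    "hertz": lambda v: v,
    "semitone": lambda v: 100.0 * 2.0 ** (v / 12.0),
    "mel": lambda v: 550.0 * math.expm1(v / 550.0),
    "bark": lambda v: 650.0 * math.sinh(v / 7.0),
}

# Hertz -> unit (inverses of the functions above).
_FROM_HZ = {
    "hertz": lambda hz: hz,
    "semitone": lambda hz: 12.0 * math.log2(hz / 100.0),
    "mel": lambda hz: 550.0 * math.log1p(hz / 550.0),
    "bark": lambda hz: 7.0 * math.asinh(hz / 650.0),
}


def convert_pitch(value: float, src: str, dst: str) -> float:
    """Convert a pitch value from unit src to unit dst via Hertz."""
    return _FROM_HZ[dst](_TO_HZ[src](value))
```

Using Hertz as the hub keeps the table small: four forward and four inverse functions cover all twelve unit pairs.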
The first speech corpus for Kurmanji (Kurmanci) Kurdish (in preparation, though partially available).
Sample files are available at: adelra.github.io/ksc
For more information regarding the corpus, contact me.
The Balaxan corpus of Kurmanji contains 58 utterances in the Kurmanji language. (Github page)
Rahimi, Adel, 2015, "Balaxan corpus of Kurmanji", LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague, HDL:11372/LRT-1531
Corpus of Wikipedia in the Kurmanji language.
Adel Rahimi, "Kurmaji Wikipedia Corpus", DOI:10.7910/DVN/OHWSUI, Harvard Dataverse Network, V1
This is the first corpus of world National Anthems consisting of ~264 national anthems from ~194 countries across the globe.
For more information read the paper: Adel Rahimi. "Corpus study of World National Anthems". DOI:10.13140/RG.2.1.2953.3924 (2015).
Adel Rahimi, "Corpus of World National Anthems", DOI:10.7910/DVN/PZG8TH, Harvard Dataverse Network, V1
The Dakhil Wordlist consists of roughly 260K Persian words (there may be some duplicates) that end in "ان", "ات", "ون", or "ین" where the ending is not a morphological suffix. The wordlist has been uploaded both as a single txt file covering all the endings and as separate files, one per ending.
For more information read the paper: Adel Rahimi, (2015) "A hybrid stemming algorithm for Persian". ArXiv: 1507.03077.
Adel Rahimi , 2015, "Dakhil wordlist for Persian vocabulary", DOI:10.7910/DVN/MJBHLN, Harvard Dataverse, V1
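One plausible use of such a wordlist is as a protected-word check that runs before affix removal. The Python sketch below illustrates that idea only; it is not the algorithm from the paper, and the function name, the length guard, and the tiny inline `protected` set (standing in for the full wordlist) are assumptions made for this example.

```python
# The four endings covered by the wordlist description above.
ENDINGS = ("ان", "ات", "ون", "ین")


def hybrid_stem(word: str, protected: set) -> str:
    """Look the word up first; strip an ending only if it is not protected."""
    if word in protected:
        # The ending belongs to the stem itself, so leave the word intact.
        return word
    for ending in ENDINGS:
        # Assumed guard: keep at least two characters of stem.
        if word.endswith(ending) and len(word) > len(ending) + 1:
            return word[:-len(ending)]
    return word


# Hypothetical two-word stand-in for the real wordlist.
protected = {"تهران", "باران"}

print(hybrid_stem("تهران", protected))   # unchanged: found in the protected set
print(hybrid_stem("درختان", protected))  # ending stripped
```

Without the look-up step, a word like "تهران" would lose its final two letters even though they are part of the stem, which is the kind of over-stemming the wordlist is meant to help avoid.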
The Corpus of Computational Linguistics is an 8 million corpus of journal publications, books, and theses. These cover interdisciplinary topics such as Speech Recognition, Experimental Phonology, Language Models, Machine Learning, Semantics, Syntactic Theory, and Information Retrieval.
Adel Rahimi , 2015, "Corpus of Computational Linguistics' Academic Literature", DOI:10.7910/DVN/YHHTCI, Harvard Dataverse, V2
Adel Rahimi , 2015, "swadesh: swadesh list for kurmanji", DOI:10.5281/zenodo.35675, Zenodo, V1.1
This is the corpus used in the research "Critical discourse analysis of Adolf Hitler's speeches".
Adel Rahimi , 2015, "Replication Data for: Rahimi, A., Critical discourse analysis of Adolf hitler's speeches", DOI:10.7910/DVN/SOANL2, Harvard Dataverse, V1