Greetings!
My name is Roman Kyrychenko, and I am a passionate Data Scientist with over eight years of hands-on experience in data analysis, research, machine learning, and deep learning. My journey in the world of data has been driven by a keen interest in statistics and machine learning, leading me to explore various technologies, environments, and programming languages. I am currently pursuing a PhD at the University of Helsinki, where my research focuses on the intricate dynamics of political polarization and propaganda analysis, particularly in the context of Ukraine.
Beyond my academic pursuits, I have a proven track record in model deployment and maintenance, and I have delivered more than 500 hours of lectures to students on Data Science, Big Data, Deep Learning, and R development. My professional experience spans leading and contributing to advanced data science and deep learning projects across diverse domains, ensuring high-quality deliverables and innovative solutions.
My Philosophy: The “Random Forest Random” Pun
As for the name of this blog, it’s a playful pun! Random forests, or random decision forests, are a powerful ensemble learning method used for classification, regression, and other tasks. They work by building many decision trees during training and then combining their outputs: for classification, the forest returns the class that most trees vote for (the mode), and for regression, it returns the mean of the individual trees’ predictions. The name “Random Forest Random” seemed funny to me, and I hope it brings a smile to your face too! It reflects my approach to data science: combining rigorous methodology with a touch of creativity and humor.
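To make the description above concrete, here is a minimal sketch in plain Python, not a production implementation. It assumes a toy one-dimensional dataset and uses one-split "stumps" in place of full decision trees; the helper names (fit_stump, fit_forest, predict_class, predict_mean) are hypothetical and chosen only for illustration. Real random forests also sample a random subset of features at each split, which this sketch leaves out because the toy data has a single feature.

```python
import random
from collections import Counter
from statistics import mean


def fit_stump(sample):
    """Fit a one-split tree (a stump) on (x, label) pairs with 0/1 labels.

    Tries every observed x as a threshold and keeps the one with the
    fewest training errors for the rule "predict 1 if x > threshold".
    """
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x > threshold) != bool(label) for x, label in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return lambda x: int(x > best_threshold)


def fit_forest(data, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(data, k=len(data))) for _ in range(n_trees)]


def predict_class(forest, x):
    """Classification: return the mode (most common vote) across the trees."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


def predict_mean(forest, x):
    """Averaging step used in regression, applied here to the 0/1 votes."""
    return mean(tree(x) for tree in forest)


# Toy data (an assumption for this sketch): label is 1 when x is above 5.
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
forest = fit_forest(data)
print(predict_class(forest, 2.0), predict_class(forest, 8.5))
print(predict_mean(forest, 8.5))
```

Because the stumps here output 0 or 1, predict_mean reads as the share of trees voting for class 1; in a true regression forest the same averaging would be applied to numeric predictions.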
Expertise and Interests
In this blog, I will be delving into a variety of topics that reflect my diverse expertise and interests. These include R development, advanced machine learning techniques (extending far beyond just random forests), sophisticated plotting using ggplot2, web scraping, spatial analysis, and in-depth discussions on my research into polarization and propaganda. My aim is to provide valuable insights, share practical experiences, and contribute to the broader data science community.
My technical proficiency encompasses a wide array of tools and languages. I am an expert in Python, utilizing libraries such as PyTorch, LangChain, LangGraph, LlamaIndex, pandas, NumPy, TensorFlow, scikit-learn, Flask, and FastAPI. My R expertise includes caret, tidyverse, and Shiny. I am also proficient in SQL and LaTeX. I have extensive experience with databases like PostgreSQL, Qdrant, Pinecone, Elasticsearch, Hive, and LanceDB, among others. My toolkit also includes PyCharm, RStudio, Jupyter, Docker, Kubernetes, Databricks, AWS, Azure, and Google Cloud.
I specialize in areas such as Natural Language Processing (NLP), sentiment analysis, topic modeling, fraud detection, recommender systems, image classification and segmentation, credit scoring, anomaly detection, network analysis, and question-answering models/chatbots. I am also skilled in model deployment using Flask, FastAPI, and Reflex, and proficient with version control and CI/CD tools such as Git, GitLab CI, and Bitbucket Pipelines.
Professional Journey and Contributions
My professional journey includes significant roles such as Senior Data Scientist/Team Lead at SoftTeco, where I led numerous advanced data science and deep learning projects. Highlights include building a job and CV matching system leveraging LLMs, developing an LLM-based one-shot learning classifier, and creating a secure chat for developers. I also contributed to demand forecasting systems and fraud detection models.
Prior to SoftTeco, I worked as a Machine Learning Engineer at Govitall, focusing on GPT-2-like language models, and as a Data Scientist at VEON/Kyivstar, where I developed various predictive and segmentation models. At CorestoneGroup, I built systems for news topic definition, social media scraping, and data visualization applications.
My commitment to education is evident through my roles as a Lecturer at Ukrainian Catholic University, teaching R for Econometrics and Football Analytics, and as a teacher at ITEA, covering Python/R for Data Science and Deep Learning. I also served as an R Coach at Q&Q Research, instructing on APIs, web scraping, and data manipulation.
Entrepreneurial Ventures and Academic Research
I am also a co-founder of innovative startups. At FOG, I developed an AI solution for automating data analysis and report generation, including a SaaS platform for expert search. As a co-founder of All-Seeing AI, I am actively involved in developing AI solutions for online propaganda detection and monitoring. I also contributed as a Data Scientist to anticorupt.in.ua, developing models to estimate corruption risks.
My academic research is published in reputable journals, covering topics such as the dynamics of aggressive discourse, the value dimension of public opinion, and social polarization. I have also contributed to pre-prints and datasets related to war-related Telegram channels. My PhD thesis, “Examining Polarisation Dynamics: The Case of Ukraine,” is a testament to my dedication to understanding complex social phenomena through data-driven approaches.
Thank you for visiting my blog, and I look forward to sharing my knowledge and insights with you.