Filip Boltuzic, Developer in Zagreb, Croatia

Filip Boltuzic

Verified Expert in Engineering

Machine Learning Engineer and Developer

Zagreb, Croatia
Toptal Member Since
April 30, 2020

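Filip is a machine learning engineer with several years of professional experience. As a software developer, he worked on large-scale problems at Amazon Web Services, and as a research assistant at the University of Zagreb, he built natural language processing models. Filip's main interests are machine learning and natural language processing, with an emphasis on building text classification models.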






Preferred Environment

Java, Git, Linux, Docker, Apache Solr, Django, PyTorch, Pandas, NumPy, Scikit-learn, Python

The most amazing...

...machine learning model I've developed is an LSTM and CRF model for segmenting text into argumentative claims, built as part of my Ph.D. thesis.

Work Experience

Research Advisor

2022 - PRESENT
Online freelance agency
  • Investigated, researched and documented caching methods in software.
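  • Replicated the most popular caching methods for predicting time to live (TTL) from research papers.
  • Built a simulator and a reinforcement learning model that attempts to solve the TTL prediction problem for object caching.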
Technologies: Machine Learning, Supervised Machine Learning, Reinforcement Learning, Deep Reinforcement Learning, Data Science, NumPy

RAG/GPT4 Expert

2024 - 2024
  • Improved an existing RAG-based tool to help the team search internal documents more efficiently.
  • Built a document processing system for different types of content (emails, DOCX attachments, Excel spreadsheets, etc).
  • Utilized RAG to implement the 1st version of multiple-question answering.
Technologies: ChatGPT, Haystack, Natural Language Processing (NLP), Machine Learning, OpenAI GPT-4 API, Azure, Amazon Web Services (AWS), Retrieval-augmented Generation (RAG), Generative Pre-trained Transformers (GPT)

Technical Blog Writer

2023 - 2024
  • Wrote several technical blog posts on topics such as machine learning, quantum computing, cloud computing, and large language models.
  • Implemented reproducible workflows across three cloud providers (AWS, Google Cloud, and Azure).
  • Contributed to Covalent, an open source workflow library.
Technologies: Machine Learning, Python, DevOps, Data Science, Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP)

AI and ML Developer

2023 - 2023
Aggieland Software
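  • Developed a large language model (LLM) LangChain bot that generates software requirements.
  • Built and deployed to the cloud a multi-process application, exposed through an API, that chats with users to generate software requirements.
  • Collaborated with two teams to integrate the LLM application through APIs, giving web and mobile applications access to it.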
Technologies: Artificial Intelligence (AI), Machine Learning, Azure Machine Learning, Large Language Models (LLMs), Llama 2, FastAPI, LangChain

AI Expert

2023 - 2023
PD4 Solutions LLC
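  • Developed an LLM-based solution to determine which scientific articles are relevant to free-text criteria entered by the user.
  • Evaluated the performance of the LLM solution and presented metrics showing a large improvement over the previously implemented solution.
  • Worked with machine learning engineers to deploy the solution and define the best architecture for applying it.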
Technologies: Artificial Intelligence (AI), Machine Learning, Python, Natural Language Processing (NLP), Language Models, Text Classification, Unsupervised Learning, LangChain, Amazon Web Services (AWS), Git, GPT, Text Generation

Senior Data Scientist

2021 - 2023
Freelance for Lionbridge (via Newfire Global Partners)
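  • Developed a machine learning sequence labeling model on text data, achieving an F1 score above 0.9.
  • Reduced the inference time of a previously developed machine learning model without sacrificing its F1 score.
  • Performed large-scale data analysis using PySpark and Databricks, which the company used to drive future business decisions.
  • Developed multiple highly scalable Python web services that are currently serving production traffic.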
Technologies: Python, Agile, Scrum, Web Services, JSON, PyTorch, SpaCy, Natural Language Toolkit (NLTK), PySpark, Jupyter, Databricks, Open Neural Network Exchange (ONNX), Neural Networks, LSTM, Pandas, Data Science, NumPy, Git, Natural Language Processing (NLP), Data Analysis, Azure Databricks

Data Science Engineer

2022 - 2022
  • Developed a prototype product recommender that shows customers' purchasing patterns.
  • Built simple AWS Lambda functions to conduct an ETL workflow.
  • Worked with PySpark on large sets of data (>100GB of historical purchases).
Technologies: Python, Machine Learning, Spark ML, Scikit-learn, PySpark, Amazon Web Services (AWS), Git

Machine Learning Engineer

2020 - 2021
Alchemy V Ltd (via Toptal)
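  • Created a marketing slogan text generator using the Hugging Face Transformers text generation pipeline and client-provided data.
  • Created a data ingestion and reporting process using multiple Google Cloud services: BigQuery, Cloud Functions, Cloud Endpoints, and Dataproc.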
  • Ported existing R reporting code to a Python web service.
Technologies: Google Cloud, Google Cloud API, Google BigQuery, R, Python, Text Generation, SQL, Git

Natural Language Processing (NLP) Consultant

2020 - 2021
Granville Knowledge Management (via Toptal)
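  • Developed a scraper to download a large (around 20,000) and diverse set of legal documents (1990 to present) from European public repositories.
  • Built text classification models using machine learning to automatically categorize documents based on their content.
  • Created a dataset of legal documents and used it to train and evaluate the text classification models. Shared the results via Google Colab so the client could interactively try out the models' performance on their own data.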
Technologies: Python, Scrapy, Web Scraping, PyTorch, Jupyter, Google Colaboratory (Colab), Text Classification, Natural Language Processing (NLP)

Research Associate

2018 - 2020
TakeLab at the University of Zagreb
  • Developed a search engine for Croatian legal documents.
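  • Built a named entity recognition model in PyTorch that combines an LSTM with a CRF.
  • Mentored several students on their internship projects and master's theses in natural language processing.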
Technologies: Scikit-learn, PyTorch, Apache Solr, Django, Python, Torch, Pandas, Data Science, Git, Natural Language Processing (NLP)

Software Development Engineer

2014 - 2017
Amazon Web Services (AWS)
  • Developed a scalable time series database solution in Java and C++, which served around 1 million requests/second.
  • Served as the team scrum master and product owner.
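  • Designed and implemented a network correlation engine microservice to process network events from the entire Amazon network (patent granted http://patents)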
Technologies: Amazon Web Services (AWS), C++, Python, Java, Algorithms, Programming, Agile, Git, Web Services

Business Intelligence Analyst

2012 - 2014
Zagrebacka banka Unicredit Group
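  • Developed SQL reports to identify promising retail strategies in the data warehouse.
  • Built an interactive tool in Java to speed up processes in Oracle Data Integrator.
  • Developed small web applications for the accounting department using PL/SQL and Oracle APEX.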
Technologies: Java, SQL, Data Science

Search Engine for Croatian Legal Documents

A Django and Apache Solr web application.

I was the lead developer on this project and proposed the system architecture as a set of microservices. The documents were stored and indexed in Solr, whereas the Django front end served requests and communicated with Solr.

Retail Sale Forecasting

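The project was to design a model that forecasts sales quantities based on historical order data, previous sales, and regions. Forecasts were made at both the regional and global levels and treated as a time series forecasting problem. I experimented with several time series forecasting techniques, such as ARIMA and SARIMA models.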

Developed a ChatGPT-like tool for legal professionals in the EU. It allows users to get an answer to any legal question with source citations. It draws on official national legal documents, EU law, and national and EU case law.

The tool is currently under development as part of a startup named Ulpian.

2012 - 2020

Ph.D. Degree in Natural Language Processing

University of Zagreb - Zagreb, Croatia

2010 - 2012

Master's Degree in Computer Science

University of Zagreb - Zagreb, Croatia

2010 - 2011

Erasmus Exchange Study in Computer Science

KTH Royal Institute of Technology - Stockholm, Sweden

2007 - 2010

Bachelor's Degree in Computer Science

University of Zagreb - Zagreb, Croatia


Convolutional Neural Networks



Scikit-learn, NumPy, Pandas, PyTorch, Google Cloud API, SpaCy, Natural Language Toolkit (NLTK), PySpark, LSTM, Spark ML


Vim Text Editor, Solr, Apache Solr, Git, Oh My Zsh, Boto, Jupyter, LaTeX, Azure Machine Learning, ChatGPT, Haystack


Python, SQL, Haskell, Java, C++, R




Elasticsearch, Google Cloud, JSON, PostgreSQL




Django, Scrapy, Streamlit


Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Artificial Intelligence (AI), Machine Learning, Back-end, OpenAI GPT-3 API, Data Analysis, Azure Databricks, Retrieval-augmented Generation (RAG), Clustering Algorithms, Clustering, Classification Algorithms, Text Classification, Torch, Web Scraping, Google Colaboratory (Colab), Google BigQuery, Text Generation, Web Services, Open Neural Network Exchange (ONNX), Neural Networks, Research, Student Engagement, Supervised Machine Learning, Time Series, Autoregressive Integrated Moving Average (ARIMA), LangChain, OpenAI, Reinforcement Learning, Deep Reinforcement Learning, Algorithms, Programming, Heuristics, Optimization, Evolutionary Computation, Genetic Algorithms, Convolutional Neural Networks (CNN), Sorting Algorithms, Pattern Recognition, Language Models, Unsupervised Learning, Big Data, Unstructured Data Analysis, Large Language Models (LLMs), Llama 2, FastAPI, Prompt Engineering, OpenAI GPT-4 API, Pinecone, FAISS
