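💶 Kafka-SparkStreamNLP is a real-time financial text analysis platform managed with Docker containers. It pulls data from a news API, uses Kafka for data stream management, runs NLP processing with Spark Streaming combined with a fine-tuned pre-trained model, and writes the results through an output stream to ClickHouse so that a visualization platform can later be used for OLAP analysis ⭐️⭐️⭐️⭐️⭐️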
Updated Feb 13, 2025 - Jupyter Notebook
Academic Sequence Labelling Between DistilBERT & Encoder-only Transformer
Extrinsic and Intrinsic Plagiarism detection
Work focused on a Transformer model for star-rating classification (1-5) of Yelp reviews.
Advanced RoBERTa- and DistilBERT-Based Abstract-Based Sentiment Analyzer. Ensemble Architecture
End-to-end backend system for extracting structured data from documents using OCR, transformer-based NER, and asynchronous processing with FastAPI.
Streamlit-based Business Analytics System for call sentiment analysis, audio transcription, and live prediction using ML and NLP.
DistilBERT model for sentence segmentation.
A mobile app that uses Machine Learning to detect and block spam or scam texts.
AI-Enhanced Exploratory Analysis and Preprocessing of Simulated Patient Vital Signs Data
Movie review sentiment analysis using DistilBERT
Classification of text from YouTube comments using DistilBERT language models from Hugging Face Transformers
AI-powered medical symptom checker using DistilBERT (Small Language Model) to predict possible diseases from user-reported symptoms with confidence visualization.
A machine-learning project that analyzes the sentiment of tweets using deep learning and NLP techniques. The model classifies tweets into positive or negative sentiment, using preprocessing, tokenization, and training on a labeled dataset. Includes data cleaning, visualization, model training, evaluation (accuracy, precision, recall, F1-score),
NLP pipeline for sentiment analysis and BERTopic topic modelling on 212,000+ Google Maps restaurant reviews.
Small NLP projects with Deep Learning techniques
End-to-end Fake News Detection & Generation system using GPT-2 for headline generation and DistilBERT for real/fake classification. Features data preprocessing, model training, FastAPI backend, Streamlit UI, and cloud deployment. Evaluated using Accuracy, Precision, Recall, F1, Perplexity, Distinct-n, and Self-BLEU.
Hybrid NLP and RAG-based system that analyzes customer feedback, detects sentiment-rating anomalies, and generates actionable business insights using LLMs.
A real-time dashboard that visualizes sentiment analysis using Hugging Face Transformers, built with Streamlit