
Clinical mental health classifier with a dash of "Vibe coding"

Published: Sep 15, 2025 • Author: Yash Mathur

How it started: vibes and Reddit posts

This project began with a somewhat impulsive idea: to build a system that could read Reddit posts and make sense of the writer’s mental and emotional state. Not diagnose, of course, because that would be presumptuous and unsafe, but rather to identify the tone and categorize it into broad clinical groups such as “Anxiety Disorder”, “Mood Disorder”, or “General Emotional Distress”. The intention was to make the model sensitive enough to notice when language hinted at crisis, and to respond by displaying mental health resources instead of cold predictions.

What began as a small college experiment gradually turned into something more serious. The prototype grew into a Flask web application powered by a hybrid architecture that combines a Convolutional Neural Network (CNN) with a TF-IDF based model. To prevent it from making risky assumptions, I added a crisis override that always intervenes when certain phrases appear, no matter what the classifier believes. It is, in essence, a careful balance between machine logic and human responsibility.

Two brains are better than one

The core of the system relies on a pair of very different models that somehow work better together than they do individually: the CNN, which picks up patterns in word sequences, and the TF-IDF model, which weighs how distinctive individual terms are.

Their predictions are averaged to create a more stable output. This approach reduces the tendency of either model to overreact to individual phrases. Confidence is assessed not only by looking at the highest probability score, but also by analyzing entropy, which provides a rough estimate of how certain the model is about its own conclusions. The combination gives the system a kind of simulated self-awareness about uncertainty, which is surprisingly useful when dealing with human language.
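As a rough sketch of what that looks like in code (the function below is illustrative rather than lifted from the project, but the averaging and entropy math follow the description above):

```python
import numpy as np

def ensemble_predict(cnn_probs: np.ndarray, tfidf_probs: np.ndarray):
    """Average two probability vectors and derive an entropy-based confidence."""
    avg = (cnn_probs + tfidf_probs) / 2.0

    # Shannon entropy of the averaged distribution: low entropy means the
    # ensemble is concentrated on one label, i.e. more certain of itself.
    entropy = -np.sum(avg * np.log(avg + 1e-12))
    max_entropy = np.log(len(avg))  # entropy of a uniform distribution
    confidence = 1.0 - entropy / max_entropy

    return int(np.argmax(avg)), float(np.max(avg)), float(confidence)
```

Normalizing by the entropy of a uniform distribution keeps the confidence score between 0 and 1 no matter how many categories the models output.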

PRAW, or how I stopped worrying and learned to fetch

The system draws its input from Reddit itself. When a user pastes a Reddit link, the backend attempts to retrieve the post using PRAW, the Python Reddit API Wrapper. If PRAW is unavailable, it falls back to a more direct method: fetching the post's public JSON representation. The process is simple but surprisingly resilient. Title and body text are extracted, cleaned, and then merged into one block of content that becomes the model's input.
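In outline, the fetch-with-fallback looks something like this. It is a minimal sketch, not the app's exact code: the credentials are placeholders and the real error handling is more deliberate.

```python
import praw
import requests

def fetch_post(url: str) -> str:
    """Fetch a Reddit post's title and body, falling back to public JSON."""
    try:
        # Placeholder credentials; in practice PRAW reads real ones from
        # praw.ini or environment variables.
        reddit = praw.Reddit(client_id="...", client_secret="...",
                             user_agent="mh-classifier/0.1")
        submission = reddit.submission(url=url)
        return f"{submission.title}\n{submission.selftext}"
    except Exception:
        # Fallback: most Reddit posts are public JSON if you append ".json".
        resp = requests.get(url.rstrip("/") + ".json",
                            headers={"User-Agent": "mh-classifier/0.1"},
                            timeout=10)
        post = resp.json()[0]["data"]["children"][0]["data"]
        return f"{post.get('title', '')}\n{post.get('selftext', '')}"
```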

The clean_text() function performs the quiet, essential work of removing URLs, usernames, emojis, and general social-media debris. By the time the text reaches the classifier, it has been stripped of the noise that would otherwise confuse the models. This preprocessing step is what allows the system to stay focused on meaning rather than metadata.
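A simplified version of clean_text() looks roughly like this; the exact patterns in the real function differ, so treat these as illustrative:

```python
import re

def clean_text(text: str) -> str:
    """Strip URLs, usernames, emojis, and markdown debris before classification."""
    text = re.sub(r"https?://\S+", " ", text)          # URLs
    text = re.sub(r"/?u/[A-Za-z0-9_-]+", " ", text)    # Reddit usernames
    text = re.sub(r"[^\x00-\x7F]+", " ", text)         # emojis / non-ASCII noise
    text = re.sub(r"[*_>#`~\[\]()]", " ", text)        # markdown markers
    text = re.sub(r"\s+", " ", text)                   # collapse whitespace
    return text.strip().lower()
```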

The red button: crisis override

Before any text reaches the classifier, it is screened for a curated list of crisis-related words and phrases. If one of these is detected, the system bypasses the model entirely and immediately returns crisis support resources. This design choice is deliberate. No matter how accurate the algorithm becomes, it should never be trusted to make a probabilistic judgment when someone expresses an intent to harm themselves.
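A stripped-down version of that screen might look like the following. The phrase list shown here is a tiny illustrative subset, not the curated list the app actually uses:

```python
# Illustrative subset only; the real list is curated and far longer.
CRISIS_PHRASES = [
    "want to die",
    "kill myself",
    "end it all",
]

def check_crisis(text: str) -> bool:
    """Screen text before classification; a match short-circuits the models."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in CRISIS_PHRASES)
```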

This safeguard might seem like a technical detail, but it represents the core idea behind the entire project: that data comes from people, and treating it responsibly matters more than model performance.

Making labels human again

Many machine learning models output cryptic tags, subreddit names, or meaningless label IDs. I wanted this one to produce something that made sense to a human reader. Each classification result is mapped to a recognizable clinical category such as Anxiety Disorder, PTSD, Substance Use Disorder, or General Emotional Distress. Alongside these, the app displays a brief summary explaining what that category usually represents, along with resource links for anyone who might want help. When a message falls into a crisis category, the interface directly displays hotline numbers instead of technical information.
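Conceptually, the mapping is just a dictionary from model labels to human-readable descriptions. The entries below are stand-ins: the category names come from the project, but the summaries and resource links are illustrative, not the app's actual copy.

```python
# Illustrative mapping; summaries and links are placeholders.
LABEL_INFO = {
    "anxiety": {
        "name": "Anxiety Disorder",
        "summary": "Language centered on persistent worry, panic, or dread.",
        "resources": ["https://adaa.org"],
    },
    "ptsd": {
        "name": "PTSD",
        "summary": "References to trauma, flashbacks, or hypervigilance.",
        "resources": ["https://www.ptsd.va.gov"],
    },
    "substance": {
        "name": "Substance Use Disorder",
        "summary": "Language around dependence, cravings, or withdrawal.",
        "resources": ["https://www.samhsa.gov"],
    },
    "distress": {
        "name": "General Emotional Distress",
        "summary": "Emotional strain that doesn't fit a narrower category.",
        "resources": ["https://988lifeline.org"],
    },
}

def describe(label: str) -> dict:
    """Fall back to the broadest category if a label is unrecognized."""
    return LABEL_INFO.get(label, LABEL_INFO["distress"])
```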

The web interface is intentionally minimal. It uses Chart.js to draw confidence bars and probability charts, giving users a quick sense of how certain the system feels about its predictions. The visual feedback transforms the model’s internal math into something that feels more like a dialogue, rather than a black box issuing verdicts.
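On the backend, that just means handing the frontend a JSON payload it can feed straight into Chart.js. A minimal sketch, assuming a hypothetical get_probabilities() helper that wraps the fetch-clean-classify pipeline described above:

```python
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

CATEGORIES = ["Anxiety Disorder", "Mood Disorder",
              "PTSD", "General Emotional Distress"]

@app.route("/classify", methods=["POST"])
def classify():
    """Return chart-ready JSON: one probability per category, plus confidence."""
    # get_probabilities() is a hypothetical stand-in for the pipeline above;
    # it returns one probability per entry in CATEGORIES.
    probs = get_probabilities(request.json["url"])
    entropy = -float(np.sum(probs * np.log(probs + 1e-12)))
    return jsonify({
        "label": CATEGORIES[int(np.argmax(probs))],
        "probabilities": dict(zip(CATEGORIES, probs.round(3).tolist())),
        "confidence": round(1.0 - entropy / float(np.log(len(probs))), 3),
    })
```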

Structure of a tiny classifier

The classifier is intentionally small and transparent. It is not built to chase benchmarks, but to explore what happens when artificial intelligence is designed to act cautiously and communicate clearly. The goal was never to replace human judgment, only to build a system that understands when to stay quiet.
Tags: Python, AI, WebDev, Reddit
