Interviewz

Interviewz is a comprehensive application designed to analyze sentiments detected in video interviews from different sources.

It uses advanced machine learning models to detect sentiments from the interview data.

Modules

1. Middleware Module

Purpose: The Middleware module acts as an orchestrator, coordinating the processing tasks across the audio, text, and video modules.

How It Works:

  • Initialization: The FastAPI app sets up the necessary routes and initializes CORS middleware.
  • Preprocessing Endpoint (`/preprocess`): Receives session and interview IDs. Initializes a `Process` class instance that manages the workflow. Downloads audio files, performs diarization (speaker identification), and converts speech to text.
  • Prediction Endpoint (`/predict`): Coordinates with external APIs for audio, text, and video analysis. Aggregates results and updates the database.

Key Components:

  • `app.py`: Defines the FastAPI application, routes, and the main processing logic.
  • `Process` class: Handles downloading, preprocessing, and interfacing with other APIs for complete interview analysis.
  • `utils/process.py`: Includes the necessary steps for audio extraction and processing, such as diarization and speech-to-text conversion.
  • `utils/utils.py`: Provides utility functions for logging, file handling, and database interactions.

2. Audio Module

Purpose: The Audio module processes audio data to extract and analyze emotions.

Overview: The Audio Emotions Analysis module of the Interviewz application is responsible for analyzing audio segments from interviews and determining the emotional content within these segments. The module uses FastAPI for serving the endpoints, PyTorch and torchaudio for audio processing, and a Supabase database for storing and retrieving data.

Components:

FastAPI Application (app.py)

Initializes a FastAPI application.

AudioEmotions (audioEmotions.py)

Handles the extraction of audio segments from storage and predicts emotions using pre-trained models. Utilizes torchaudio for audio handling and transformers for emotion classification.
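The segment-extraction step can be sketched independently of torchaudio: diarization produces per-speaker timestamps in seconds, which are converted to sample indices to slice the waveform. The function below is a simplified illustration, not the module's actual code; each resulting clip would then be fed to the pre-trained emotion classifier.

```python
def slice_segments(waveform, sample_rate, segments):
    """Slice a waveform into clips using diarization timestamps.

    waveform: a sequence of samples (list, NumPy array, or tensor row)
    sample_rate: samples per second, e.g. 16000
    segments: list of (start_seconds, end_seconds) tuples
    """
    clips = []
    for start_s, end_s in segments:
        # Convert second-based boundaries to sample indices.
        start = int(start_s * sample_rate)
        end = int(end_s * sample_rate)
        clips.append(waveform[start:end])
    return clips
```

With torchaudio, `waveform` would come from `torchaudio.load(path)` and the same index arithmetic applies per channel.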

Utilities (utils/utils.py)

Provides methods for logging, configuration management, file operations, and database interactions. Manages connections to both Supabase for data handling and S3 buckets for file storage.

Models (utils/models.py)

Manages the loading and usage of machine learning models for audio classification. Ensures models are loaded only once, via a singleton pattern, to conserve resources.
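The singleton pattern mentioned above can be sketched as follows. This is a minimal illustration, not the actual `utils/models.py`: the expensive model load happens only on the first construction, and every later `Models()` call returns the same instance.

```python
class Models:
    """Loads the audio classifier once and reuses it across requests."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # The expensive one-time load would happen here, e.g. a
            # pre-trained audio classifier from transformers. A placeholder
            # object stands in for the real model in this sketch.
            cls._instance.classifier = object()
        return cls._instance
```

Because `__new__` caches the instance on the class, `Models() is Models()` is always true and the model weights are never reloaded per request.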

3. Text Module

Purpose: The Text module processes textual data to analyze sentiments and emotions.

Overview: The Text module of the Interviewz application processes textual data to analyze emotional content using machine learning models. It leverages state-of-the-art NLP models for sequence classification to predict emotional responses based on the text from interviews.

Components:

FastAPI Application (app.py)

Initializes a FastAPI application.

TextEmotions (textEmotions.py)

Manages the text analysis process by fetching text data, applying emotional analysis, and updating results. Utilizes pre-trained NLP models to classify text into emotional categories.
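The fetch-classify-update flow described above can be sketched with the model dependency injected, which keeps the example independent of any specific NLP model. The class and parameter names here are hypothetical; in the real module the classifier would be a pre-trained sequence-classification model and the fetch/save callables would talk to Supabase.

```python
class TextEmotions:
    """Sketch of the text-analysis flow with injected dependencies."""

    def __init__(self, fetch_rows, classify, save_results):
        self.fetch_rows = fetch_rows        # interview_id -> [(row_id, text), ...]
        self.classify = classify            # text -> emotion label
        self.save_results = save_results    # writes labels back to the database

    def process(self, interview_id):
        rows = self.fetch_rows(interview_id)
        labels = {row_id: self.classify(text) for row_id, text in rows}
        self.save_results(interview_id, labels)
        return labels
```

Injecting the three callables also makes the flow trivially testable with stubs, without loading a model or opening a database connection.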

Utilities (utils/utils.py)

Includes logging setup, configuration management, and methods for file operations on S3 storage. Implements methods for updating database records and managing connections to Supabase for data storage.

Models (utils/models.py)

Responsible for loading and managing NLP models and tokenizers for text emotion classification. Implements a singleton pattern so models are loaded once per application lifecycle.

4. Video Module

Purpose: The Video module processes video data to detect emotions through facial recognition.

Overview: The Video module of the Interviewz application processes video data to analyze emotional content using advanced computer vision models. It handles video segments, applies emotion recognition using frame analysis, and updates emotional data back to the database.

Components:

FastAPI Application (app.py)

Initializes a FastAPI application.

VideoEmotions (videoEmotions.py)

Manages the video analysis process by fetching video data, applying frame-by-frame emotion analysis, and updating results. Utilizes DeepFace for facial emotion recognition (planned to be replaced by a YOLO model).
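Frame-by-frame analysis produces one emotion label per frame, which must be reduced to a segment-level result. A simple way to do this, sketched below under the assumption that the per-frame detector (currently DeepFace) has already run, is a majority vote over the frame labels; this helper is illustrative, not the module's actual code.

```python
from collections import Counter


def aggregate_frame_emotions(frame_labels):
    """Return the most frequent emotion across a segment's frames.

    frame_labels: list of per-frame labels, e.g. ["happy", "neutral", ...]
    Returns None for an empty segment.
    """
    if not frame_labels:
        return None
    counts = Counter(frame_labels)
    return counts.most_common(1)[0][0]
```

A majority vote smooths over single-frame misdetections, which are common when a face is briefly occluded or mid-expression.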

Utilities (utils/utils.py)

Provides methods for configuration management, database interactions, file handling, and logging. Manages connections to Supabase for data handling and S3 buckets for file storage.

Models (utils/models.py)

Handles configuration settings but currently does not load specific models due to the module's reliance on DeepFace.

API Endpoints

Middleware Module

  • `/preprocess`: Downloads the audio, runs diarization, and converts speech to text for a given session and interview.
  • `/predict`: Triggers audio, text, and video analysis, aggregates the results, and updates the database.

Audio, Text, and Video Modules

Each of these modules serves its own FastAPI application (`app.py`) whose endpoints are called by the Middleware during the prediction step.

Directory Structure

interviewz/
│
├── middleware/
│   ├── app.py
│   └── utils/
│       ├── process.py
│       └── utils.py
│
├── audio/
│   ├── app.py
│   ├── audioEmotions.py
│   └── utils/
│       ├── models.py
│       └── utils.py
│
├── text/
│   ├── app.py
│   ├── textEmotions.py
│   └── utils/
│       ├── models.py
│       └── utils.py
│
├── video/
│   ├── app.py
│   ├── videoEmotions.py
│   └── utils/
│       ├── models.py
│       └── utils.py

Repository

Check out the project on GitHub.

Contact

For any questions, please contact the project maintainers at contact@interviewz.online.