Build a Lightning-Fast YouTube Video Summarizer with Carbon AI and OpenAI

Web Development, Artificial Intelligence

Dean · 2024-11-28 · 8 min read

In this tutorial, we'll build a Node.js application that summarizes YouTube videos using Carbon AI's transcript API and an OpenAI large language model. Traditional approaches to video summarization usually involve several steps: downloading the video, running a separate speech-to-text tool, and processing the output by hand. Instead of dealing with youtube-dl, managing audio files, and juggling separate APIs for transcription and summarization, we fetch the transcript directly and summarize it in one short workflow. This cuts both development time and processing time, so you can focus on using the insights rather than extracting them.

The Traditional vs. Our Approach

Traditional video processing pipelines typically require:

Setting up youtube-dl/yt-dlp for video downloading
Extracting audio from videos
Implementing speech-to-text with tools like Whisper
Setting up a separate summarization pipeline
Managing local file storage and cleanup

Our approach eliminates these complexities by:

Directly fetching ready-to-use transcripts
Processing them with a single API call

This reduces both development time (from hours to minutes) and processing time (from multiple API calls and file operations to just two main steps).

Prerequisites

Node.js installed on your system
API keys for Carbon AI and OpenAI
Basic understanding of async/await in JavaScript

Step 1: Initial Setup

First, let's install the required dependencies:

npm install axios openai dotenv

Create a .env file to store your API credentials:

CARBON_API_KEY=your_carbon_api_key
CARBON_CUSTOMER_ID=your_customer_id
OPENAI_API_KEY=your_openai_key
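
A missing or misspelled variable in .env usually only shows up later as an authentication error from one of the APIs. As a minimal sketch, a startup check like the one below can catch that earlier. The helper names findMissingEnv and assertEnv are hypothetical and not part of dotenv or either API.

```javascript
// Hypothetical helpers: check that the three variables from .env are set
// before any API call is made.
const REQUIRED_VARS = ["CARBON_API_KEY", "CARBON_CUSTOMER_ID", "OPENAI_API_KEY"];

// Returns the names of required variables that are absent or blank.
function findMissingEnv(env = process.env) {
  return REQUIRED_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
}

// Throws a single error listing every missing variable.
function assertEnv(env = process.env) {
  const missing = findMissingEnv(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In the application you would call require("dotenv").config() first and then assertEnv(), so the check sees the values loaded from the .env file.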

Step 2: Authentication with Carbon AI

The first step in our process is obtaining an access token from Carbon AI. Here's how we implement it:

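// Load credentials from .env and import the HTTP client
require("dotenv").config();
const axios = require("axios");

const { CARBON_API_KEY, CARBON_CUSTOMER_ID } = process.env;

// Exchange the Carbon API key and customer ID for an access token
async function getCarbonAccessToken() {
  try {
    console.log("Obtaining Carbon access token...");
    const response = await axios.get(
      "https://api.carbon.ai/auth/v1/access_token",
      {
        headers: {
          authorization: `Bearer ${CARBON_API_KEY}`,
          "customer-id": CARBON_CUSTOMER_ID,
        },
      }
    );

    return {
      accessToken: response.data.access_token,
      refreshToken: response.data.refresh_token,
    };
  } catch (error) {
    // Prefer the API's error body when the server responded
    console.error(
      "Error getting access token:",
      error.response?.data || error.message
    );
    throw error;
  }
}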

Step 3: Fetching YouTube Transcripts

Once we have our access token, we can fetch the transcript using Carbon's API:

async function fetchYouTubeTranscript(videoId, accessToken) {
  try {
    const response = await axios.get(
      "https://api.carbon.ai/fetch_youtube_transcript",
      {
        headers: {
          authorization: `token ${accessToken}`,
        },
        params: {
          id: videoId,
          raw: false,
        },
      }
    );

    if (response.data.error) {
      throw new Error(`Carbon API Error: ${response.data.error}`);
    }

    return response.data.data;
  } catch (error) {
    console.error(
      "Error fetching transcript:",
      error.response?.data || error.message
    );
    throw error;
  }
}
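
The catch blocks above log error.response?.data || error.message, which prints the API's error body when the server responded and falls back to the generic message otherwise. If you prefer a single string that also includes the HTTP status, that pattern can be pulled into a small helper. This is only a sketch: describeApiError is a hypothetical name, and it relies only on the response.status and response.data fields that axios attaches to failed requests.

```javascript
// Hypothetical helper that turns an axios-style error into one log line.
function describeApiError(error) {
  const status = error.response?.status;
  const body = error.response?.data;

  // The server responded with a body: include it, plus the status if known.
  if (body !== undefined && body !== null && body !== "") {
    const text = typeof body === "string" ? body : JSON.stringify(body);
    return status ? `HTTP ${status}: ${text}` : text;
  }

  // No response body (network failure, timeout, plain Error).
  return error.message || "Unknown error";
}
```

With this in place, a catch block could log console.error("Error fetching transcript:", describeApiError(error)) instead of repeating the fallback expression.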

Step 4: Processing with OpenAI

We'll use GPT-4 to generate a concise, actionable summary of the transcript:

const OpenAI = require("openai");

// The system prompt defines the structure every summary should follow
const SYSTEM_PROMPT = [
  "You are a helpful assistant that creates concise, actionable summaries of video content.",
  "For each transcript, provide:",
  "1. Main topics covered",
  "2. Key takeaways",
  "3. Actionable steps or recommendations",
].join("\n");

async function summarizeTranscript(transcript) {
  try {
    const openai = new OpenAI({
      apiKey: process.env.OPENAI_API_KEY,
    });

    const completion = await openai.chat.completions.create({
      model: "gpt-4",
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        {
          role: "user",
          content: `Please analyze and summarize this video transcript: ${transcript}`,
        },
      ],
    });

    // The SDK returns the completion object directly
    return completion.choices[0].message.content;
  } catch (error) {
    console.error("Error generating summary:", error.message);
    throw error;
  }
}
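
Long videos can produce transcripts that exceed the model's context window, in which case a single request with the whole transcript will fail. One common workaround is to split the text into chunks, summarize each chunk, and then summarize the combined summaries. The sketch below covers only the splitting step. It assumes the transcript is a plain string (the tutorial does not show the exact shape of Carbon's response), and both the chunkTranscript name and the 8000-character default are illustrative choices, not values from either API.

```javascript
// Hypothetical helper: split a transcript into chunks of at most `maxChars`
// characters, breaking only at whitespace so words stay intact.
function chunkTranscript(text, maxChars = 8000) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  let current = "";

  for (const word of words) {
    // Start a new chunk when adding this word would exceed the limit.
    if (current && current.length + 1 + word.length > maxChars) {
      chunks.push(current);
      current = word;
    } else {
      current = current ? `${current} ${word}` : word;
    }
  }

  // A single word longer than maxChars is kept whole as its own chunk.
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk could then be passed to summarizeTranscript, with the partial summaries joined and summarized once more to produce the final result.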

Step 5: Putting It All Together

Finally, we create a main processing function that coordinates all these steps:

async function processVideo(videoUrl) {
  try {
    // Extract the video ID from the "v" query parameter, ignoring any
    // other parameters such as "&t=42s"
    const videoId = new URL(videoUrl).searchParams.get("v");
    if (!videoId) {
      throw new Error("Invalid YouTube URL");
    }

    // Get access token
    const { accessToken } = await getCarbonAccessToken();

    // Fetch transcript
    const transcript = await fetchYouTubeTranscript(videoId, accessToken);

    // Generate summary
    const summary = await summarizeTranscript(transcript);

    return {
      videoId,
      transcript,
      summary,
    };
  } catch (error) {
    console.error("Error processing video:", error.message);
    throw error;
  }
}
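
The ID extraction in processVideo assumes a standard watch URL. Shared links often use other forms, such as youtu.be short links or /embed/ and /shorts/ paths. A more tolerant parser might look like the sketch below. The function name extractVideoId and the set of supported URL shapes are assumptions for illustration, not an exhaustive list of what YouTube accepts.

```javascript
// Hypothetical helper: return the video ID for common YouTube URL shapes,
// or null when the input is not a recognizable YouTube video URL.
function extractVideoId(videoUrl) {
  let url;
  try {
    url = new URL(videoUrl);
  } catch {
    return null; // not a parseable URL
  }

  const host = url.hostname.replace(/^www\./, "");

  // Short links: https://youtu.be/<id>
  if (host === "youtu.be") {
    return url.pathname.slice(1) || null;
  }

  if (host === "youtube.com" || host === "m.youtube.com") {
    // Standard links: https://www.youtube.com/watch?v=<id>
    if (url.pathname === "/watch") return url.searchParams.get("v");

    // Embed and Shorts links: /embed/<id> or /shorts/<id>
    const match = url.pathname.match(/^\/(embed|shorts)\/([^/]+)/);
    if (match) return match[2];
  }

  return null;
}
```

To use it, processVideo would call extractVideoId(videoUrl) and throw the "Invalid YouTube URL" error when the result is null.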

Ultra-Compact Version

If you want the most streamlined version possible, here's everything compressed into a single function:

// Requires axios and the OpenAI class from the "openai" package to be
// imported, and the environment variables to be loaded
async function summarizeVideo(url) {
  // axios puts the response body on `data`
  const { data: auth } = await axios.get(
    "https://api.carbon.ai/auth/v1/access_token",
    {
      headers: {
        authorization: `Bearer ${process.env.CARBON_API_KEY}`,
        "customer-id": process.env.CARBON_CUSTOMER_ID,
      },
    }
  );

  const transcript = await axios.get(
    "https://api.carbon.ai/fetch_youtube_transcript",
    {
      headers: { authorization: `token ${auth.access_token}` },
      params: { id: new URL(url).searchParams.get("v"), raw: false },
    }
  );

  // The OpenAI SDK returns the completion object directly
  const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content:
          "Create concise, actionable summaries with: 1. Main topics 2. Key takeaways 3. Action steps",
      },
      { role: "user", content: `Summarize: ${transcript.data.data}` },
    ],
  });
  return completion.choices[0].message.content;
}

Usage Example

Here's how to use the summarizer:

async function main() {
  try {
    const videoUrl = "https://www.youtube.com/watch?v=uDS5NsvnIC0";
    const result = await processVideo(videoUrl);

    console.log("\nVideo Summary:");
    console.log("=============");
    console.log(result.summary);

    // Save to file
    const fs = require("fs");
    fs.writeFileSync("video_summary.json", JSON.stringify(result, null, 2));
    console.log("\nFull results saved to video_summary.json");
  } catch (error) {
    console.error("Main process error:", error.message);
  }
}

main();

Key Features

Automatic transcript fetching from YouTube videos
Proper authentication handling with Carbon AI
Integration with OpenAI's GPT-4 for intelligent summarization
Error handling and logging throughout the process
Saves results to a JSON file for later use

Future Improvements

Add support for batch processing multiple videos
Implement caching for API responses
Add more customization options for summary format
Create a web interface for easier access
Add support for different languages
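
As a starting point for the first item, batch processing could be sketched with Promise.allSettled, which waits for every video and reports failures individually instead of aborting on the first error. The summarizeMany name and the shape of the returned objects are hypothetical. The processFn parameter stands in for processVideo so the sketch stays self-contained.

```javascript
// Hypothetical batch helper: run `processFn` for every URL and collect
// one result object per video, whether it succeeded or failed.
async function summarizeMany(urls, processFn) {
  // allSettled never rejects, so one failing video does not abort the batch.
  const settled = await Promise.allSettled(urls.map((url) => processFn(url)));

  return settled.map((outcome, i) =>
    outcome.status === "fulfilled"
      ? { url: urls[i], ok: true, result: outcome.value }
      : {
          url: urls[i],
          ok: false,
          error: String(outcome.reason?.message ?? outcome.reason),
        }
  );
}
```

Calling summarizeMany(urls, processVideo) starts all requests at once, so for large batches you would likely want to limit concurrency to stay within API rate limits.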

Conclusion

This Node.js application shows how to combine two AI services into a video summarization tool. By pairing Carbon AI's transcript API with OpenAI's language models, we can generate concise, actionable summaries of YouTube videos without watching them from start to finish.

The code is modular and can be easily extended to support additional features or integrated into larger applications. Whether you're doing research, content creation, or just trying to save time, this tool can help you quickly extract the key information from YouTube videos.

Try it out and let me know how it works for you! Feel free to modify the code to suit your specific needs or contribute improvements to make it even better.