.. _index:

======================
Welcome to Xpark!
======================

.. toctree::
   :maxdepth: 2
   :hidden:

   reference/get_started
   examples/index
   reference/index
   reference/processors


Xpark is a multimodal AI data processing platform designed to streamline and optimize data workflows for AI applications.
It provides comprehensive capabilities for data handling, transformation, and seamless integration with AI workflows.


Processing Multimodal Data with Xpark
--------------------------------------

.. tabs::
   .. code-tab:: python Text

      from xpark.dataset import TextEmbedding, from_items
      from xpark.dataset.expressions import col

      ds = from_items(
         [
            "what is the advantage of using the GPU rendering options in Android?",
            "Blank video when converting uncompressed AVI files with ffmpeg",
         ]
      )
      ds = ds.with_column(
         "embedding",
         TextEmbedding(
            # Local embedding model.
            "Qwen/Qwen3-Embedding-0.6B",
         )
         .options(num_workers={{"CPU": 1}})
         .with_column(col("item")),
      )

      output = ds.take_all()
      

   .. code-tab:: python Image

      from PIL import Image

      from xpark.dataset import ImageCompute, ImageTextSimilarityScore, read_image
      from xpark.dataset.expressions import col

      ds = read_image("/data/Test/test-ray-data/data/mini_coco_images")

      # Image Data Function: resized_image
      ds = ds.with_column("image_resized", ImageCompute.resize(col("image"), size=(224, 224)))

      # Image AI Function: image text similarity score
      ds = ds.with_column(
         "image_text_similarity",
         ImageTextSimilarityScore(text="a photo of a cat")
         .options(batch_size=16, num_workers={"CPU": 1})
         .with_column(col("image")),
      )

      output = ds.take_all()

      print(output[0]["image_text_similarity"])
      Image.fromarray(output[0]["image_resized"]).show()

   .. code-tab:: python Video

      import pyarrow as pa

      from xpark.dataset import VideoCompute, from_arrow
      from xpark.dataset.expressions import col

      ds = from_arrow(
         pa.table(
            {
                  "video": ["/path/to/video1.mp4", "/path/to/video2.mp4"],
            }
         )
      )

      # Get Video Bit Rate
      ds = ds.with_column("video_bit_rate", VideoCompute.bit_rate(col("videos")))

      # Extract Audio
      ds = ds.with_column("audio", VideoCompute.extract_audio(col("video"), codec="aac", sample_rate=16000))

      # Extract frames
      ds = ds.with_column("frames", VideoCompute.extract_frames(col("video"), start_time=30, end_time=50, num_frames=3))


      output = ds.take_all()

   .. code-tab:: python Audio

      from __future__ import annotations

      from xpark.dataset.expressions import col
      from xpark.dataset import SpeechToText, from_items

      ds = from_items(["multilingual.mp3"])
      ds = ds.with_column(
         "text",
         SpeechToText(
            # Local transcriptions model.
            "Systran/faster-whisper-large-v3",
         )
         .options(num_workers={{"GPU": 1}})
         .with_column(col("item")),
      )

      print(ds.take_all(2))
      
      
Next Steps
--------------------------------------

- :ref:`get_started` — A quick tutorial to get you started with Xpark
- :ref:`reference_index` — Full Dataset API reference
- :ref:`processors` — All built-in Data and AI Processors