Learn how to configure document chunking in your R2R deployment
R2R supports the following chunking providers:

r2r: Default offering for light installations; a simple, lightweight parser included in R2R.
unstructured_local: Default offering for full installations; uses the open-source Unstructured package.
unstructured_api: Cloud offering of Unstructured.

Chunking is configured in r2r.toml:
Options for the r2r provider:

provider: The chunking provider (defaults to "r2r").
chunking_strategy: The chunking method ("recursive").
chunk_size: The target size for each chunk.
chunk_overlap: The number of characters to overlap between chunks.
excluded_parsers: List of parsers to exclude (e.g., ["mp4"]).

Options for the unstructured_local and unstructured_api providers:

strategy: The overall chunking strategy ("auto", "fast", or "hi_res").
chunking_strategy: The specific chunking method ("by_title" or "basic").
new_after_n_chars: Soft maximum size for a chunk.
max_characters: Hard maximum size for a chunk.
combine_under_n_chars: Size threshold below which small sections are combined.
overlap: Number of characters to overlap between chunks.

Full installations use the unstructured_local provider, which runs the open-source Unstructured library for local processing.

Chunking settings can also be passed to the ingest_files endpoint, allowing dynamic adjustment of chunking parameters based on the input documents or specific use cases.
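
As a rough illustration of adjusting chunking parameters per document, the sketch below builds a settings dictionary from the Unstructured option names listed above and varies it by file type. The helper names (chunking_config_for, validate), the numeric values, and the per-extension rules are all assumptions made for this example and are not part of R2R. The sanity checks simply follow from the "soft maximum" and "hard maximum" descriptions; they are not claimed to mirror the library's own validation.

```python
from pathlib import Path

# Keys are the Unstructured-provider options listed in this document.
# The values are illustrative assumptions, not R2R defaults.
BASE_CONFIG = {
    "provider": "unstructured_local",
    "strategy": "auto",
    "chunking_strategy": "by_title",
    "new_after_n_chars": 512,
    "max_characters": 1024,
    "combine_under_n_chars": 128,
    "overlap": 20,
}


def validate(config: dict) -> None:
    """Check that the size limits are mutually consistent (example checks only)."""
    if config["new_after_n_chars"] > config["max_characters"]:
        raise ValueError("new_after_n_chars (soft max) exceeds max_characters (hard max)")
    if config["combine_under_n_chars"] > config["max_characters"]:
        raise ValueError("combine_under_n_chars exceeds max_characters")
    if not 0 <= config["overlap"] < config["max_characters"]:
        raise ValueError("overlap must be non-negative and smaller than max_characters")


def chunking_config_for(path: str) -> dict:
    """Return a copy of BASE_CONFIG adjusted for the given file (hypothetical helper)."""
    config = dict(BASE_CONFIG)
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        # Layout-heavy documents: prefer the slower, more thorough strategy.
        config["strategy"] = "hi_res"
    elif suffix in {".txt", ".md"}:
        # Plain text has no titles or layout to analyse.
        config["strategy"] = "fast"
        config["chunking_strategy"] = "basic"
    validate(config)
    return config


if __name__ == "__main__":
    for name in ("report.pdf", "notes.txt", "slides.pptx"):
        print(name, chunking_config_for(name))
```

The resulting dictionary would be sent along with the files to the ingest_files endpoint. The exact argument used to pass it depends on the R2R client version and is deliberately not shown here.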