# Storage Providers

Storage abstraction layer supporting multiple cloud providers with dict-based configuration.

Prerequisites

Complete the [Installation Guide](/ko/installation.md) with the `all` extra (e.g., `pip install -e ".[all]"`).

## Installation[​](#installation "Installation에 대한 직접 링크")

```bash
pip install synapse-sdk        # Local storage only
pip install synapse-sdk[all]   # All cloud providers (S3, GCS, SFTP) + Ray + MCP

```

***

## Available Providers[​](#available-providers "Available Providers에 대한 직접 링크")

| Provider   | Aliases              | Description                   |
| ---------- | -------------------- | ----------------------------- |
| `local`    | `file_system`        | Local filesystem              |
| `s3`       | `amazon_s3`, `minio` | S3-compatible storage         |
| `gcs`      | `gs`, `gcp`          | Google Cloud Storage          |
| `sftp`     | -                    | SFTP servers                  |
| `http`     | `https`              | HTTP file servers (read-only) |
| `synology` | `synology_drive`     | Synology Drive NAS storage    |

***

## Basic Usage[​](#basic-usage "Basic Usage에 대한 직접 링크")

```python
from synapse_sdk.utils.storage import (
    get_storage,
    get_pathlib,
    get_path_file_count,
    get_path_total_size,
)

# Get storage instance
storage = get_storage({
    'provider': 'local',
    'configuration': {'location': '/data'}
})

# Upload a file
from pathlib import Path
url = storage.upload(Path('/tmp/file.txt'), 'uploads/file.txt')

# Check existence
exists = storage.exists('uploads/file.txt')

# Get pathlib object for path operations
path = get_pathlib(config, '/uploads')
for file in path.rglob('*.txt'):
    print(file)

# Get statistics
count = get_path_file_count(config, '/uploads')
size = get_path_total_size(config, '/uploads')

```

***

## Provider Configurations[​](#provider-configurations "Provider Configurations에 대한 직접 링크")

### Local Filesystem[​](#local-filesystem "Local Filesystem에 대한 직접 링크")

```python
config = {
    'provider': 'local',  # or 'file_system'
    'configuration': {
        'location': '/data'
    }
}

```

### S3 / MinIO[​](#s3--minio "S3 / MinIO에 대한 직접 링크")

```python
config = {
    'provider': 's3',  # or 'amazon_s3', 'minio'
    'configuration': {
        'bucket_name': 'my-bucket',
        'access_key': 'AKIAIOSFODNN7EXAMPLE',
        'secret_key': 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY',
        'region_name': 'us-east-1',
        'endpoint_url': 'http://minio:9000',  # Optional, for MinIO
    }
}

```

### Google Cloud Storage[​](#google-cloud-storage "Google Cloud Storage에 대한 직접 링크")

```python
config = {
    'provider': 'gcs',  # or 'gs', 'gcp'
    'configuration': {
        'bucket_name': 'my-bucket',
        'credentials': '/path/to/service-account.json',
    }
}

```

### SFTP[​](#sftp "SFTP에 대한 직접 링크")

```python
config = {
    'provider': 'sftp',
    'configuration': {
        'host': 'sftp.example.com',
        'username': 'user',
        'password': 'secret',  # or use private_key
        # 'private_key': '/path/to/id_rsa',
        'root_path': '/data',
    }
}

```

### HTTP (Read-only)[​](#http-read-only "HTTP (Read-only)에 대한 직접 링크")

```python
config = {
    'provider': 'http',  # or 'https'
    'configuration': {
        'base_url': 'https://files.example.com/uploads/',
        'timeout': 60,
    }
}

```

### Synology Drive[​](#synology-drive "Synology Drive에 대한 직접 링크")

```python
config = {
    'provider': 'synology',  # or 'synology_drive'
    'configuration': {
        'url': 'https://nas.example.com:5001',
        'username': 'admin',
        'password': 'secret',
        'otp_code': '',           # Optional 2FA code
        'base_path': '/team-folder/my-team',
        'verify_ssl': False,      # For self-signed certificates
    }
}

```

Session Management

Synology NAS invalidates previous sessions when a new login occurs with the same account and session name. The SDK automatically handles session re-authentication for both `error_code=119` (session expired) and `error_code=105` (anonymous/permission denied).

SynologyPath Limitations

`SynologyPath` does not support `glob()` or `rglob()`. The SDK automatically falls back to `listdir()`-based file discovery when these methods are unavailable. The `base_path` configuration must start with `/team-folder/`.

***

## Storage Protocol[​](#storage-protocol "Storage Protocol에 대한 직접 링크")

All storage providers implement the `StorageProtocol`:

```python
from typing import Protocol, TYPE_CHECKING
from pathlib import Path

if TYPE_CHECKING:
    from upath import UPath

class StorageProtocol(Protocol):
    def upload(self, source: Path, target: str) -> str: ...
    def exists(self, target: str) -> bool: ...
    def get_url(self, target: str) -> str: ...
    def get_pathlib(self, path: str) -> Path | UPath: ...
    def get_path_file_count(self, pathlib_obj: Path | UPath) -> int: ...
    def get_path_total_size(self, pathlib_obj: Path | UPath) -> int: ...

```

### Custom Storage Implementation[​](#custom-storage-implementation "Custom Storage Implementation에 대한 직접 링크")

```python
from synapse_sdk.utils.storage import StorageProtocol
from pathlib import Path

class MyCustomStorage:
    """Implement StorageProtocol via duck typing."""

    def __init__(self, base_path: str):
        self.base_path = Path(base_path)

    def upload(self, source: Path, target: str) -> str:
        # Upload implementation
        return f"custom://{target}"

    def exists(self, target: str) -> bool:
        # Check existence
        return True

    def get_url(self, target: str) -> str:
        # Get URL for target
        return f"custom://{target}"

    def get_pathlib(self, path: str) -> Path:
        # Get pathlib object for path operations
        return self.base_path / path.lstrip('/')

    def get_path_file_count(self, pathlib_obj: Path) -> int:
        # Count files recursively
        return sum(1 for f in pathlib_obj.rglob('*') if f.is_file())

    def get_path_total_size(self, pathlib_obj: Path) -> int:
        # Calculate total size
        return sum(f.stat().st_size for f in pathlib_obj.rglob('*') if f.is_file())

```

***

## Utility Functions[​](#utility-functions "Utility Functions에 대한 직접 링크")

### get\_storage()[​](#get_storage "get_storage()에 대한 직접 링크")

Create storage instance from configuration.

```python
from synapse_sdk.utils.storage import get_storage

storage = get_storage({
    'provider': 's3',
    'configuration': {
        'bucket_name': 'my-bucket',
        # ...
    }
})

```

### get\_pathlib()[​](#get_pathlib "get_pathlib()에 대한 직접 링크")

Get a pathlib-like object for cloud storage navigation.

```python
from synapse_sdk.utils.storage import get_pathlib

path = get_pathlib(config, '/uploads')

# Iterate files
for file in path.rglob('*.json'):
    print(file.name)

```

### get\_path\_file\_count()[​](#get_path_file_count "get_path_file_count()에 대한 직접 링크")

Count files in a storage path.

```python
from synapse_sdk.utils.storage import get_path_file_count

count = get_path_file_count(config, '/uploads')
print(f"Files: {count}")

```

### get\_path\_total\_size()[​](#get_path_total_size "get_path_total_size()에 대한 직접 링크")

Get total size of files in a storage path.

```python
from synapse_sdk.utils.storage import get_path_total_size

size = get_path_total_size(config, '/uploads')
print(f"Total size: {size} bytes")

```

***

## Migration from v1[​](#migration-from-v1 "Migration from v1에 대한 직접 링크")

### Breaking Changes[​](#breaking-changes "Breaking Changes에 대한 직접 링크")

| v1                                     | v2                          |
| -------------------------------------- | --------------------------- |
| `get_storage('s3://bucket?key=value')` | Dict config only            |
| `FileSystemStorage` class              | `LocalStorage` class        |
| `GCPStorage` class                     | `GCSStorage` class          |
| Subclass `BaseStorage`                 | Implement `StorageProtocol` |

### Provider Aliases (Backwards Compatible)[​](#provider-aliases-backwards-compatible "Provider Aliases (Backwards Compatible)에 대한 직접 링크")

| Alias                | Maps To                        |
| -------------------- | ------------------------------ |
| `file_system`        | `local` / `LocalStorage`       |
| `gcp`, `gs`          | `gcs` / `GCSStorage`           |
| `amazon_s3`, `minio` | `s3` / `S3Storage`             |
| `synology_drive`     | `synology` / `SynologyStorage` |

***

## Examples[​](#examples "Examples에 대한 직접 링크")

### Upload with Progress[​](#upload-with-progress "Upload with Progress에 대한 직접 링크")

```python
from synapse_sdk.utils.storage import get_storage
from pathlib import Path

storage = get_storage({
    'provider': 's3',
    'configuration': {
        'bucket_name': 'ml-models',
        'access_key': '...',
        'secret_key': '...',
    }
})

# Upload model file
model_path = Path('/models/model.pt')
url = storage.upload(model_path, 'models/v1/model.pt')
print(f"Uploaded to: {url}")

```

### Read and Process Remote Files[​](#read-and-process-remote-files "Read and Process Remote Files에 대한 직접 링크")

```python
from synapse_sdk.utils.storage import get_pathlib
import tempfile

config = {
    'provider': 'gcs',
    'configuration': {
        'bucket_name': 'datasets',
        'credentials': '/path/to/creds.json',
    }
}

# Get pathlib object for remote file
remote_path = get_pathlib(config, '/data/dataset.csv')

# Read content directly or save to temp file
with tempfile.NamedTemporaryFile(suffix='.csv', delete=False) as tmp:
    tmp.write(remote_path.read_bytes())
    tmp.flush()

    # Process the file
    import pandas as pd
    df = pd.read_csv(tmp.name)

```

### List and Filter Files[​](#list-and-filter-files "List and Filter Files에 대한 직접 링크")

```python
from synapse_sdk.utils.storage import get_pathlib

path = get_pathlib(config, '/experiments')

# Find all checkpoints
checkpoints = list(path.rglob('*.ckpt'))
print(f"Found {len(checkpoints)} checkpoints")

# Filter by pattern
recent = [f for f in checkpoints if 'epoch_10' in f.name]

```

***

## See Also[​](#see-also "See Also에 대한 직접 링크")

* [Migration Guide](/ko/migration.md) - v1 to v2 migration
* [Installation](/ko/installation.md) - Storage extras installation
* [Network Utilities](/ko/utils/network.md) - Network streaming utilities
