Caching System¶
PyCompTox includes a comprehensive caching system to reduce network traffic, improve performance, and minimize load on EPA's servers.
Overview¶
The caching system provides:
- Automatic caching of all API responses
- Unlimited cache size by default
- Export/import functionality for cache portability
- Cache statistics and management
- Per-request control via the use_cache parameter
- Configurable expiration for cache entries
Quick Start¶
Basic Usage¶
Caching is enabled by default and works automatically:
from pycomptox import chemical
# First call - fetches from API and caches the response
chem = chemical.Chemical()
results1 = chem.search_by_starting_value("caffeine") # API call + cache
# Second call - returns cached response (no API call)
results2 = chem.search_by_starting_value("caffeine") # from cache
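The behaviour above can be pictured with a small in-memory sketch. This is not PyCompTox's implementation: `make_cache_key` and `TinyCache` are hypothetical names, and keying entries by a hash of the endpoint plus its parameters is an assumption about how a response cache of this kind might work.

```python
import hashlib
import json


def make_cache_key(endpoint: str, params: dict) -> str:
    """Build a deterministic key from an endpoint and its parameters (hypothetical helper)."""
    # sort_keys makes the key independent of the order in which parameters are given
    payload = json.dumps({"endpoint": endpoint, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TinyCache:
    """In-memory stand-in for a response cache; not the library's CacheManager."""

    def __init__(self):
        self._store = {}
        self.fetch_count = 0  # how many times the "API" was actually called

    def get_or_fetch(self, endpoint, params, fetch):
        key = make_cache_key(endpoint, params)
        if key not in self._store:
            # Cache miss: call the API once and remember the response
            self.fetch_count += 1
            self._store[key] = fetch()
        return self._store[key]


cache = TinyCache()
fake_api = lambda: [{"preferredName": "Caffeine"}]  # stands in for a network call

first = cache.get_or_fetch("chemical/search", {"name": "caffeine"}, fake_api)
second = cache.get_or_fetch("chemical/search", {"name": "caffeine"}, fake_api)
print(cache.fetch_count)  # the second call did not trigger another fetch
```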
Bypassing Cache¶
Use use_cache=False to bypass the cache for specific requests:
from pycomptox import chemical
chem = chemical.Chemical()
# Force fresh data from API
results = chem.search_by_starting_value("caffeine", use_cache=False)
Cache Management¶
from pycomptox import cache_status, clear_cache
# Get cache statistics
status = cache_status()
print(f"Total entries: {status['total_entries']}")
print(f"Cache size: {status['total_size_mb']} MB")
print(f"Endpoints cached: {list(status['endpoints'].keys())}")
# Clear entire cache
cleared = clear_cache()
print(f"Cleared {cleared} cache entries")
# Clear specific endpoint
cleared = clear_cache("chemical/search")
Cache Configuration¶
Default Cache Location¶
By default, cache files are stored in:
- Windows: C:\Users\<username>\.pycomptox\cache\
- macOS/Linux: ~/.pycomptox/cache/
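Both paths listed above are the `.pycomptox/cache` folder under the user's home directory. A minimal sketch for locating and inspecting that folder with the standard library, assuming the default location has not been overridden:

```python
from pathlib import Path

# Assumption: the default location is ".pycomptox/cache" under the home directory
default_cache_dir = Path.home() / ".pycomptox" / "cache"
print(f"Expected cache directory: {default_cache_dir}")

if default_cache_dir.is_dir():
    # Count the files currently stored there, including any subfolders
    file_count = sum(1 for p in default_cache_dir.rglob("*") if p.is_file())
    print(f"Files in cache directory: {file_count}")
else:
    print("Cache directory has not been created yet")
```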
Custom Cache Configuration¶
from pycomptox import CacheManager, set_default_cache
# Create custom cache with expiration
custom_cache = CacheManager(
    cache_dir="/path/to/custom/cache",
    max_age_days=7,  # Expire entries after 7 days
    enabled=True
)
# Set as default for all API clients
set_default_cache(custom_cache)
# Or use with specific client
from pycomptox import chemical
chem = chemical.Chemical(cache_manager=custom_cache)
Disable Caching¶
from pycomptox import CacheManager, set_default_cache
# Disable caching globally
disabled_cache = CacheManager(enabled=False)
set_default_cache(disabled_cache)
# Or disable for specific client
from pycomptox import chemical
chem = chemical.Chemical(use_cache=False)
Cache Export and Import¶
Exporting Cache¶
Export the cache to share it with others or to keep a backup:
from pycomptox import export_cache
# Export entire cache to a file
result = export_cache("my_cache_backup.json")
print(f"Exported {result['entries_exported']} entries")
print(f"File size: {result['file_size_mb']} MB")
Importing Cache¶
Import a previously exported cache:
from pycomptox import import_cache
# Import cache from file
result = import_cache("my_cache_backup.json")
print(f"Imported {result['entries_imported']} entries")
print(f"Skipped {result['entries_skipped']} existing entries")
# Overwrite existing entries
result = import_cache("my_cache_backup.json", overwrite=True)
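The difference between the default import and `overwrite=True` can be sketched with plain dictionaries. `merge_entries` is a hypothetical helper written for illustration, not part of PyCompTox; it only mirrors the skip-versus-overwrite behaviour and the result keys shown above.

```python
def merge_entries(existing: dict, incoming: dict, overwrite: bool = False) -> dict:
    """Merge imported entries into an existing cache mapping (illustrative only)."""
    imported = 0
    skipped = 0
    for key, entry in incoming.items():
        if key in existing and not overwrite:
            # Default behaviour: keep the local entry and count the import as skipped
            skipped += 1
            continue
        existing[key] = entry
        imported += 1
    return {"entries_imported": imported, "entries_skipped": skipped}


local = {"caffeine": "local copy"}
backup = {"caffeine": "backup copy", "aspirin": "backup copy"}

default_result = merge_entries(local, backup)
kept_value = local["caffeine"]  # the local entry survives a default import

overwrite_result = merge_entries(local, backup, overwrite=True)
replaced_value = local["caffeine"]  # now replaced by the backup entry
```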
Cache Statistics¶
Get detailed information about cache usage:
from pycomptox import cache_status
status = cache_status()
# Basic information
print(f"Cache enabled: {status['enabled']}")
print(f"Cache directory: {status['cache_dir']}")
print(f"Max age (days): {status['max_age_days']}")
# Size statistics
print(f"\nTotal entries: {status['total_entries']}")
print(f"Total size: {status['total_size_mb']} MB")
# Per-endpoint breakdown
print("\nEntries by endpoint:")
for endpoint, count in status['endpoints'].items():
    print(f"  {endpoint}: {count} entries")
# Time information
print(f"\nOldest entry: {status['oldest_entry']}")
print(f"Newest entry: {status['newest_entry']}")
Advanced Usage¶
Cleanup Expired Entries¶
If you use cache expiration, you can trigger cleanup manually:
from pycomptox import CacheManager
cache = CacheManager(max_age_days=30)
# Remove entries older than 30 days
removed = cache.cleanup_expired()
print(f"Removed {removed} expired entries")
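As a rough illustration of what an age check involves, the sketch below compares an entry's timestamp with a maximum age in days. `is_expired` is a hypothetical helper, and it assumes ISO 8601 timestamps like the one shown in the Cache File Format section; how `cleanup_expired()` decides internally is not documented here.

```python
from datetime import datetime, timedelta


def is_expired(timestamp: str, max_age_days: int, now: datetime) -> bool:
    """Return True if an ISO 8601 timestamp is older than max_age_days (illustrative)."""
    created = datetime.fromisoformat(timestamp)
    return now - created > timedelta(days=max_age_days)


# A fixed "now" keeps the example deterministic
now = datetime(2025, 12, 25)
stamp = "2025-11-18T10:30:00.123456"

expired_after_30_days = is_expired(stamp, 30, now)  # entry is about 36.5 days old
expired_after_60_days = is_expired(stamp, 60, now)
```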
Per-Client Cache Configuration¶
Different clients can use different cache settings:
from pycomptox import chemical, exposure, CacheManager
# Cache for chemical data (7 day expiration)
chem_cache = CacheManager(
    cache_dir="~/.pycomptox/cache/chemical",
    max_age_days=7
)
# Cache for exposure data (30 day expiration)
exp_cache = CacheManager(
    cache_dir="~/.pycomptox/cache/exposure",
    max_age_days=30
)
# Use different caches for different clients
chem = chemical.Chemical(cache_manager=chem_cache)
exp_pred = exposure.ExposurePrediction(cache_manager=exp_cache)
Selective Caching¶
Cache some requests but not others:
from pycomptox import chemical
chem = chemical.Chemical()
# Cache this search
results1 = chem.search_by_starting_value("common_chemical", use_cache=True)
# Don't cache this rare search
results2 = chem.search_by_starting_value("rare_chemical", use_cache=False)
# Use default setting (True)
results3 = chem.search_by_starting_value("another_chemical")
Cache Management Best Practices¶
1. Monitor Cache Size¶
Regularly check cache size to prevent disk space issues:
from pycomptox import cache_status
status = cache_status()
# Alert if cache exceeds 1 GB
if status['total_size_mb'] > 1024:
    print("Warning: Cache size exceeds 1 GB")
    print("Consider clearing old entries or setting expiration")
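If you want to cross-check the reported size from outside the library, the directory can be measured directly. This is a standalone sketch using only the standard library; `dir_size_mb` is a hypothetical helper, and the temporary directory merely stands in for a real cache folder.

```python
import tempfile
from pathlib import Path


def dir_size_mb(directory: Path) -> float:
    """Sum the sizes of all files under a directory, in MB (1 MB = 1024 * 1024 bytes)."""
    total_bytes = sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())
    return total_bytes / (1024 * 1024)


# Demonstration on a throwaway directory holding a single 1 MB file
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "entry.json").write_bytes(b"x" * 1024 * 1024)
    size_mb = dir_size_mb(Path(tmp))

print(f"Measured size: {size_mb:.1f} MB")
```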
2. Set Appropriate Expiration¶
For data that changes frequently, use shorter expiration:
from pycomptox import CacheManager, set_default_cache
# Expire entries after 7 days
cache = CacheManager(max_age_days=7)
set_default_cache(cache)
# Periodically cleanup
cache.cleanup_expired()
3. Export Cache for Reuse¶
Share cache with team members or across projects:
from pycomptox import export_cache, import_cache
# Export cache
export_cache("shared_cache.json")
# On another machine or project
import_cache("shared_cache.json")
4. Clear Cache Selectively¶
Clear specific endpoints when data might be stale:
from pycomptox import clear_cache
# Clear only chemical search cache
clear_cache("chemical/search")
# Keep other cached data
5. Disable for Development¶
Disable cache during development to always get fresh data:
from pycomptox import CacheManager, set_default_cache
# Development mode - no caching
dev_cache = CacheManager(enabled=False)
set_default_cache(dev_cache)
Performance Benefits¶
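A cached response is read from local disk instead of being fetched over the network, so repeated requests return much faster. The following script measures the difference: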
import time
from pycomptox import chemical, clear_cache
chem = chemical.Chemical()
# Clear cache for fair test
clear_cache()
# First call (no cache)
start = time.perf_counter()  # perf_counter has higher resolution than time.time
results1 = chem.search_by_starting_value("caffeine")
time1 = time.perf_counter() - start
print(f"First call (API): {time1:.3f} seconds")
# Second call (cached)
start = time.perf_counter()
results2 = chem.search_by_starting_value("caffeine")
time2 = time.perf_counter() - start
print(f"Second call (cache): {time2:.3f} seconds")
if time2 > 0:  # avoid dividing by zero if the cached call is too fast to measure
    print(f"Speedup: {time1/time2:.1f}x faster")
Typical results:
- First call (API): 0.5-2.0 seconds
- Second call (cache): 0.001-0.005 seconds
- Speedup: 100-1000x faster
Cache File Format¶
Cache entries are stored as JSON files with the following structure:
{
  "timestamp": "2025-11-18T10:30:00.123456",
  "endpoint": "chemical/search",
  "params": {
    "name": "caffeine"
  },
  "response": {
    "dtxsid": "DTXSID0020268",
    "preferredName": "Caffeine",
    "casrn": "58-08-2"
  }
}
This format allows:
- Easy inspection of cached data
- Manual editing if needed
- Version control friendly (when exported)
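Because entries are plain JSON, they can be inspected with the standard library alone. The sketch below parses the example entry shown above from an inline string; reading a real entry from disk would work the same way, but file names and directory layout are not specified here, so none are assumed.

```python
import json
from datetime import datetime

# The example entry from this section, inlined as a string
raw_entry = """
{
  "timestamp": "2025-11-18T10:30:00.123456",
  "endpoint": "chemical/search",
  "params": {"name": "caffeine"},
  "response": {
    "dtxsid": "DTXSID0020268",
    "preferredName": "Caffeine",
    "casrn": "58-08-2"
  }
}
"""

entry = json.loads(raw_entry)
cached_at = datetime.fromisoformat(entry["timestamp"])  # ISO 8601 timestamp

print(f"Endpoint: {entry['endpoint']}")
print(f"Cached on: {cached_at.date()}")
print(f"Chemical: {entry['response']['preferredName']}")
```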
Troubleshooting¶
Cache Not Working¶
Check cache status to diagnose issues:
from pycomptox import cache_status
status = cache_status()
if not status['enabled']:
    print("Cache is disabled")
elif status['total_entries'] == 0:
    print("Cache is empty - first requests will populate it")
else:
    print(f"Cache working: {status['total_entries']} entries")
Cache Directory Permissions¶
If cache fails silently, check directory permissions:
import os
from pathlib import Path
from pycomptox import cache_status
status = cache_status()
cache_dir = Path(status['cache_dir'])
if cache_dir.exists():
    if not cache_dir.is_dir():
        print("Cache path exists but is not a directory")
    elif not os.access(cache_dir, os.W_OK):
        print("No write permission for cache directory")
else:
    print("Cache directory does not exist - will be created on first use")
Corrupted Cache Entries¶
The cache system automatically removes corrupted entries:
from pycomptox import CacheManager
cache = CacheManager()
# This will return None and delete corrupted file
result = cache.get("endpoint", {"param": "value"})
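The general pattern behind this kind of self-healing read can be sketched as follows. `load_entry` is a hypothetical helper, not the library's code: it treats a file that fails to parse as corrupted, deletes it, and reports a cache miss by returning None.

```python
import json
import tempfile
from pathlib import Path


def load_entry(path: Path):
    """Return the parsed entry, or None if the file is missing or corrupted (illustrative)."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None  # nothing cached yet
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Corrupted file: remove it so the next request can repopulate the cache
        path.unlink(missing_ok=True)  # missing_ok requires Python 3.8+
        return None


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.json"
    bad = Path(tmp) / "bad.json"
    good.write_text('{"endpoint": "chemical/search"}', encoding="utf-8")
    bad.write_text("{not valid json", encoding="utf-8")

    good_result = load_entry(good)
    bad_result = load_entry(bad)
    bad_still_exists = bad.exists()
    good_still_exists = good.exists()
```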
Large Cache Size¶
If cache grows too large:
from pycomptox import cache_status, clear_cache, CacheManager, set_default_cache
status = cache_status()
print(f"Cache size: {status['total_size_mb']} MB")
# Option 1: Clear entire cache
clear_cache()
# Option 2: Clear specific endpoints
clear_cache("chemical/search")
# Option 3: Enable expiration for future entries
cache = CacheManager(max_age_days=30)
set_default_cache(cache)
cache.cleanup_expired()
API Reference¶
For complete API documentation, see:
- Cache API Reference