FAQ

Frequently asked.

Honest answers about scope, architecture, and access.

Genematon is your autonomous ML engineer. Connect your data, describe the task, and Genematon creates and owns the full ML pipeline — data cleaning, table flattening, feature engineering, architecture selection, and training. Once the solution is created, it can be deployed on demand behind a real-time REST endpoint or a batch file processor.

Scope is precise: production ML — classification, regression, forecasting, and text understanding. We don't claim to do everything; we ship what we claim to ship.

Two things, in this order.

1. Genematon improves itself. Pointed at its own codebase, the platform's debug success rate went from 30% to 80%. The same engine that designs your pipelines is the engine we use to improve Genematon. Recursive self-improvement, in production.

2. Genematon won a Kaggle competition outright. It autonomously built a supply chain forecasting pipeline that outperformed specialized, handcrafted solutions from human experts.

Traditional AutoML picks from a fixed library of predefined models, basic feature engineering, and hyperparameter tuning. Genematon generates custom architectures and training code, exploring solutions AutoML can't reach.

It also handles multi-table data and text natively, where most AutoML tools require you to flatten everything into a single table first.

Production ML across four workload types: classification, regression, forecasting, and text understanding.

  • Forecasting — demand, energy load, dynamic pricing, predictive maintenance.
  • Classification & Risk — fraud, credit risk, churn, LTV.
  • Text Understanding — RAG over enterprise documents, intelligent document processing, classification on unstructured text. Foundation models are used within these objectives — we don't claim to train new foundation LLMs from scratch.

LLMs are exceptional at reasoning and text generation, but they cannot natively forecast demand, detect fraud in structured data, or predict customer churn.

By providing an AI agent API for machine learning, Genematon bridges this gap. When you connect your agents via our MCP server, you empower them to autonomously trigger model training, evaluate datasets, and query predictive endpoints. This unlocks true agentic AI for data science, allowing your agents to use quantitative ML models as tools in their reasoning loop.

Two ways, both first-class: an MCP server for agentic systems that prefer the tool-call pattern, and a plain REST API for any service in any language. Same operations, same response shapes — pick whichever fits your stack.

In either case, your code calls Genematon when it needs a trained model. We provision compute, run the full pipeline (data cleaning, feature engineering, architecture selection, and training), and deploy it on demand. Your code handles reasoning and orchestration; Genematon handles the ML.

Each agent or service authenticates with scoped credentials and every action is auditable. See the Developers page for executable examples in both transports.
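As a rough illustration of the REST path, the sketch below builds (but does not send) an authenticated request that asks for a new pipeline. The base URL, the `/pipelines` path, the payload field names, and the bearer-token scheme are all placeholders assumed for this example, not Genematon's documented API; the Developers page is the authoritative source.

```python
import json
import urllib.request

# Hypothetical placeholder: not a real Genematon endpoint.
BASE_URL = "https://api.example.invalid/v1"


def build_training_request(token: str, data_uri: str, task: str) -> urllib.request.Request:
    """Build (without sending) a POST request that asks for a new pipeline."""
    # Field names are assumptions made for illustration only.
    payload = {"data_uri": data_uri, "task": task}
    return urllib.request.Request(
        url=f"{BASE_URL}/pipelines",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # scoped credential, assumed bearer scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_training_request(
    token="scoped-token-placeholder",
    data_uri="s3://your-bucket/sales.parquet",
    task="Forecast weekly demand per store",
)
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the call easy to inspect, log, and unit-test before any compute is provisioned.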

Training, feature engineering, and deployment are the wrong work to put inside an agent's reasoning loop or your service's request handler. They're long-running, compute-heavy, and stateful — none of which agent runtimes or app servers handle well.

Calling Genematon as a service (over MCP or REST) lets your code issue a single, well-defined call and continue working while the pipeline runs out-of-band. It also draws a clean ownership line: your code owns reasoning, planning, and orchestration; Genematon owns data cleaning, feature engineering, architecture selection, and training, followed by deployment and monitoring on demand.
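One way to picture the out-of-band pattern is a submit-then-poll loop: the caller checks job status at intervals instead of blocking on training. This is a minimal sketch under stated assumptions. The status names (`queued`, `running`, `succeeded`, `failed`) and the `fetch_status` callable are hypothetical stand-ins for whatever the real API returns.

```python
import time
from typing import Callable

# Hypothetical status names; the real service may report different states.
TERMINAL_STATES = {"succeeded", "failed"}


def wait_for_pipeline(
    fetch_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal state, then return that state."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("pipeline did not finish within the polling budget")


# Stand-in for a real status call so the sketch runs without a network.
fake_statuses = iter(["queued", "running", "running", "succeeded"])
result = wait_for_pipeline(lambda: next(fake_statuses), poll_seconds=0.0)
print(result)
```

In an agent runtime, the same loop could run in a background task so the reasoning loop stays free between polls.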

Your data, your boundary. Genematon runs on Kubernetes. Enterprise clients can deploy a dedicated tenant inside their own environment, where data, compute, and pipelines never leave their perimeter, or use our fully managed hosted service. Developer-tier clients run on our shared hosted infrastructure with strict isolation.

  • Encryption: All data is encrypted in transit and at rest.
  • Access controls: Role-based permissions ensure only authorized users (or scoped agents) can access your data.
  • Isolated execution: Generated code runs in ephemeral, isolated containers that are destroyed after each run, preventing data persistence between executions.
  • Network security: Containers have strict egress filtering and can only communicate with pre-approved services.

The reasoning engine inside Genematon — the part that designs architectures, generates training code, evaluates models, and manages deployments — runs on open-source LLMs. No closed-API vendor sits inside the core system.

No deprecation calendars. No silent retraining. No surprise pricing changes inside the platform you depend on. We're not a maximalist "everything is open-source" platform — we're open-source where it matters most.

Pipeline generation typically takes a few hours, though this varies based on data complexity and the modeling objective. Genematon automates the architecture-selection, code-generation, evaluation, and deployment work that would normally take a team of ML engineers weeks or months.

Yes. Genematon works with multiple tables — related or independent — without requiring you to manually join or flatten them.
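For context, this is the kind of manual flattening step that single-table tools typically expect you to write yourself. The tables, column names, and aggregates below are invented for illustration; they say nothing about how Genematon handles multi-table data internally.

```python
from collections import defaultdict

# Invented example data: one row per customer, many orders per customer.
customers = [
    {"customer_id": 1, "region": "north"},
    {"customer_id": 2, "region": "south"},
]
orders = [
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 1, "amount": 60.0},
    {"customer_id": 2, "amount": 25.0},
]


def flatten(customers, orders):
    """Collapse the one-to-many orders table into one row per customer."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for order in orders:
        totals[order["customer_id"]] += order["amount"]
        counts[order["customer_id"]] += 1
    return [
        {
            **customer,
            "order_count": counts[customer["customer_id"]],
            "order_total": totals[customer["customer_id"]],
        }
        for customer in customers
    ]


flat = flatten(customers, orders)
print(flat)
```

Each aggregate chosen by hand (count, total) discards detail from the child table, which is why the choice of flattening strategy can affect model quality.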

File uploads (CSV, Excel, JSON, and Parquet), cloud storage (AWS S3, Azure Blob), data warehouses, and APIs.

Yes. Monitoring, drift detection, automatic retraining (when data is available), and pipeline versioning are configured at deploy time. You can also roll back or switch to an alternate generated solution from the same pool.
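To make "drift detection" concrete, the sketch below computes the Population Stability Index (PSI), one common statistic for comparing a feature's training distribution with its live distribution. It is offered as a general illustration only; which drift metrics Genematon actually uses is not stated here, and the bin count and sample data are assumptions.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample and a new sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


training = [i / 100 for i in range(100)]
live_same = list(training)
live_shifted = [x + 0.5 for x in training]

print(population_stability_index(training, live_same))
print(population_stability_index(training, live_shifted))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, which is the kind of signal that could trigger retraining.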

The biggest failure point in machine learning is models dying in Jupyter notebooks. Genematon eliminates this by treating production as the default state.

When a model is trained, Genematon automatically handles the MLOps: containerization, API endpoint generation, and deployment to a scalable Kubernetes cluster. It also configures data drift monitoring and automatic retraining pipelines out of the box, so your models stay accurate in the real world without requiring a dedicated DevOps team.
