For Unlimited AI — https://unlimited-ai-free.pages.dev
Unlimited AI is a free, web-based AI chat interface and API proxy. It is operated by a single individual who has chosen to remain anonymous for personal privacy and safety reasons. There is no parent company, no board of directors, no paid staff, and no corporate structure of any kind. The operator resides in the United States.
Because the operator is an individual rather than a legal entity, there is no registered agent for service of process, no data protection officer, and no formal complaint handling procedure under GDPR, CCPA, or any other data protection regulation. This is disclosed openly so that you can make an informed decision about whether to use the service.
Hosting. This service is hosted on Cloudflare Pages using the pages.dev free tier. Cloudflare Pages is a static-site and function-hosting platform comparable to Vercel, Netlify, GitHub Pages, AWS Amplify, or Firebase Hosting. It is used by thousands of legitimate projects every day — open-source framework documentation (Astro, Hugo, Remix), personal portfolios, small-business websites, documentation sites, landing pages, and full production applications. The free tier includes automatic HTTPS certificates (TLS 1.3), global CDN distribution across 330+ cities, DDoS protection, and serverless function support via Cloudflare Workers.
It is also true that free hosting platforms with automatic SSL and global CDN are attractive to bad actors. Phishing sites, scam landing pages, token-stealing interfaces, and malware distribution campaigns operate on Cloudflare Pages, just as they operate on Vercel, Netlify, GitHub Pages, AWS, Google Cloud, Azure, and every other cloud platform with a free tier. Abuse exists on every major hosting platform. The presence of abuse on a platform does not make every site on that platform abusive. Applying that logic consistently would mean distrusting every Gmail address because scammers use Gmail, every .com domain because malicious sites use .com, every website on AWS because AWS hosts scam infrastructure, and every profile on every social media platform because bots and fake accounts exist there. The hosting provider is not a proxy for intent.
What distinguishes this service from a phishing or scam operation is its observable behavior:
security.txt), an AI-oriented site overview (llms.txt), a full model catalog with routing-alias notes, a complete data-flow architecture diagram showing every hop your request takes, and a candid FAQ that openly acknowledges legal gray areas, funding limits, and the operator’s anonymity. Phishing operations avoid documentation because documentation invites scrutiny. They do not publish privacy policies, explain their architecture, or disclose their limitations.The operator encourages users to evaluate the service based on its content and documented behavior, not on its hosting provider. If a custom domain would make the service appear more legitimate in your assessment, note that it would not change the underlying infrastructure, the code, the data flow, the privacy practices, or the transparency — it would only change the hostname in the address bar. The substance of the service would be identical.
Online presence. This service has never been marketed, advertised, submitted to any directory or launch platform, posted on social media, announced on Product Hunt, Hacker News, or Reddit, or sent to any reviewer, blogger, or influencer for coverage. There is no Twitter account, no GitHub organization, no Discord server, no Telegram group, no newsletter, no referral program, no affiliate link, and no review-generation mechanism. The operator is not trying to grow a user base, attract funding, monetize traffic, or build a business around this service. The service exists because the operator wanted to build and maintain it.
The absence of independent reviews, Reddit threads, blog posts, trust scores, or any other form of third-party social proof is a direct and predictable consequence of having never solicited any. The service has no mechanism for generating public reviews — there are no accounts, no ratings, no testimonials, no public feedback display, no social share buttons, no referral incentives, and no community platform. The only feedback channel is the in-app feedback form, which is intentionally private so that users can share thoughts or report issues without their submission being published or associated with their identity. This means there is no organic source of third-party content about this site for search engines to index, for discussion platforms to link to, or for reviewers to discover.
The catch-22 of social proof for new services. This service is in a familiar bind: it cannot have independent reviews until people use it and talk about it publicly, but people may be hesitant to try it without independent reviews. Every service that ever received its first Reddit thread, Hacker News discussion, or blog post review was once in exactly this position — unknown, unreviewed, and unproven. The only way through is for early adopters to evaluate the service on its own terms, test it with non-sensitive data, verify its claims, and then choose to share their experience on whatever platform they prefer. The operator has no control over whether or when that happens, and is not accelerating it through paid reviews, astroturfed testimonials, manufactured social proof, or any other artificial mechanism.
Why the absence of reviews is a neutral signal. The vast majority of legitimate projects on the internet have no written-about presence. For every service that has generated public discussion, there are thousands that operate quietly with a small user base and no public attention at all. The default state of a project on the internet is to have no social proof. Its absence does not indicate anything about the project’s legitimacy, quality, or trustworthiness — it simply indicates that the rare combination of events required to generate public discussion (discovery, usage, motivation to write, and a platform to publish on) has not occurred yet. The absence of manufactured reviews is itself a form of honesty — the operator could easily create fake positive reviews or pay for low-quality coverage, as many scam operations do, but has chosen not to.
What you can verify without reviews. Rather than relying on third-party opinions that do not exist yet, you can evaluate the service directly:
/v1/chat/completions and verify the response matches the documented behavior.The operator has no plans to build a community platform, solicit reviews, or create a social media presence. Users who want to discuss the service are free to do so on any platform they choose; the operator simply will not be participating in or moderating those discussions. The service will continue to exist — or not — based on whether the operator continues to find the time and resources to maintain it, not on whether external validation materializes.
API key. The “key” issued by this service is an IP-bound rate-limit token. It is not a secret credential in the traditional sense. There is no authentication layer, no permission system, and no access control behind it. The key exists so that the rate limiter can distinguish between users without requiring accounts, passwords, or any personal information. Anyone sharing your IP address (e.g., on a NAT or corporate network) would effectively share your key. This is an intentional trade-off: the absence of accounts means there is no password database to leak, no credentials to steal in a breach, and no personal information to expose. The key is a convenience mechanism, not a security boundary. If your use case requires real authentication with secret rotation, audit logging, and access controls, this service cannot provide that and you should use a commercial provider.
Every request made to this service passes through the following chain. Each hop is described with its role, its data exposure, and its persistence characteristics.
Streaming behavior. All chat endpoints use Server-Sent Events (SSE) to stream responses incrementally. For /v1/chat/completions, the format follows the OpenAI streaming specification with choices[0].delta.content in each chunk. For /v1/messages, the format follows the Anthropic streaming specification. The stream ends with data: {"done":true}. If the stream is interrupted before completion (network failure, Worker timeout exceeding the 30-second hobby-tier limit, provider error), the Worker attempts a recovery mechanism: it sends the original prompt and the partial response back to the same model and asks it to continue from where it stopped, without repeating completed content. Up to 3 recovery attempts are made. If all attempts fail, the partial response is preserved in the chat with a “Continue” button so you can manually retry. Recovery consumes requests against the rate limit but does not expose additional data — it simply re-sends the original prompt and partial response through the same pipeline.
Rate limiting. The service enforces a per-IP rate limit of 45 requests per minute using a sliding window stored in the Cloudflare Worker’s memory. The sliding window is continuous — it does not reset at fixed clock intervals but instead tracks request timestamps over the last 60 seconds for each unique IP address. There is no per-model limit, no per-endpoint limit, no concurrent request limit, no daily cap, and no total usage cap within the sliding window. If the limit is exceeded, the server returns HTTP 429 with a Retry-After header indicating the number of seconds to wait. The rate limiter is applied before any request is forwarded to the upstream gateway or AI provider, so rate-limited requests do not consume provider API quota. The rate limiter exists to prevent a single IP from saturating the upstream provider quota that the operator pays for out of pocket, and to ensure the service remains usable by multiple people simultaneously.
Concurrent requests. Cloudflare Workers can handle hundreds of concurrent invocations, and the upstream gateway is designed to handle concurrent requests as well. There is no connection pool limit, no queue, and no serialization of requests from the same IP. Multiple concurrent requests from the same IP are processed in parallel and each counts toward the sliding window rate limit. The rate limit is based on total request count over the window, not on concurrent connection count.
Content filtering. The service does not implement any content filtering, moderation, safety classification, or prompt injection detection. Messages are forwarded exactly as received to the selected AI provider. Each provider applies its own usage policies, safety classifiers, and content filters at their layer — OpenAI has refusal classifiers, Anthropic has constitutional AI safeguards, Google has safety filters, etc. Those provider-level filters are outside the operator’s control and apply to requests originating from this service identically to how they would apply to direct requests. Provider-level refusals or safety responses are passed back through the chain to you without modification or additional filtering.
API compatibility. The service exposes three API endpoints: /v1/chat/completions (OpenAI-compatible), /v1/messages (Anthropic-compatible), and /api/chat (legacy custom schema). All three use the same model IDs, the same rate limit, and the same streaming mechanism. The standard-compatible endpoints allow SDKs and tools that support OpenAI or Anthropic to be configured with a custom base URL. The /v1/chat/completions endpoint also accepts the tools parameter for function calling if the underlying model supports it. The service does not host or execute any tool functions — it forwards tool definitions to the provider and returns the model’s responses, including any tool call requests.
Multimodal / image support. The /v1/chat/completions endpoint accepts the standard OpenAI multimodal message format — a content array containing entries with type: "text" and type: "image_url". If the selected model and its underlying provider support vision capabilities, the image is forwarded to the provider as a base64-encoded data URL or a URL reference. The service does not host, cache, transcode, or store images at any layer — they pass through the Cloudflare Worker and upstream gateway in transit only. Image data contributes to the request body size, which is subject to the Cloudflare Worker’s request size limit (approximately 10 MB on the free hobby plan for the total request body, including message text, system instructions, and encoded image data). Models that do not support vision will ignore or reject image inputs. Provider-level image handling and retention are governed by each provider’s privacy policy.
Model parameters. The service passes the following standard parameters through to the AI provider without modification: temperature, top_p, max_tokens / max_output_tokens, stop sequences, seed (for deterministic sampling where supported), frequency_penalty, presence_penalty, and effort (for models that support reasoning effort levels). Not all providers support all parameters — unsupported parameters are silently ignored by the provider. The service does not interpret, clamp, rewrite, or default any of these parameters beyond what the frontend UI sends. If you use the API directly (not the frontend), you can set any parameter the provider supports. The service also does not enforce minimum or maximum values — setting extreme values (e.g., temperature above 2.0) will be handled entirely by the provider.
Context window handling. Each model has a maximum context window size (measured in tokens). The frontend displays each model’s context window size in the model selector. When a request is sent, the full conversation history (all previous messages in the chat, the system instructions, and the new prompt) is forwarded to the provider. If the total exceeds the model’s context window, the provider may: truncate the oldest messages (FIFO), return a context-length error, or silently drop the excess depending on the provider’s implementation. The service does not implement automatic token counting, truncation, summarization, or sliding-window management. If you receive a context-length error, start a new chat or reduce the conversation history. The memory token mechanism preserves conversation context across messages but does not protect against context window overflow.
HTTP error reference. The API endpoints return standard HTTP status codes that reflect different failure modes. 200 indicates a successful completion. 400 indicates a bad request — missing or malformed fields, an invalid model ID, or request body that exceeds size limits. 401 indicates the API key is missing, invalid, or bound to a different IP address. 429 indicates the rate limit has been exceeded; the response includes a Retry-After header. 500 indicates an internal worker error, usually from an upstream provider failure or worker timeout. 502 or 503 indicate the upstream gateway or AI provider is unreachable. 504 indicates the upstream request timed out — the Cloudflare Worker has a 30-second execution limit on the hobby plan. All error responses include a JSON body with an error object containing a message string describing the issue. Streaming errors are delivered as an event: error SSE frame before the stream terminates.
Token limit per response. The service does not impose a per-response token limit. Each model has a maximum output token limit set by the provider (typically 4,096 to 16,384 tokens depending on the model). This limit is enforced by the provider. The Cloudflare Worker’s 30-second execution timeout on the hobby plan may interrupt very long or very slow generations before the model finishes. The streaming recovery mechanism (up to 3 retry attempts) attempts to continue interrupted responses by re-sending the original prompt and partial response to the same model. If recovery fails, the partial response is preserved with a Continue button in the frontend.
What this means for your data:
None. The operator does not operate any database, logging service, analytics service, or storage infrastructure that records prompts, responses, IP addresses, timestamps, or any other personal data. The Cloudflare Workers runtime does not persist data between requests. The upstream gateway is configured to not log or store requests.
This service is hosted on Cloudflare Pages and Cloudflare Workers. Cloudflare processes requests at their edge network and, like all hosting providers, generates standard operational logs. These logs may include:
Cloudflare retains these logs for a limited period (typically 24–48 hours) for network operations and security purposes. Cloudflare’s data handling is governed by their Privacy Policy and their Data Processing Addendum, both of which are publicly available:
The operator does not have access to Cloudflare’s edge logs. Cloudflare’s logging is governed entirely by their own policies, which the operator cannot override or disable.
When a request is forwarded to an AI provider (e.g., OpenAI for GPT-5, Anthropic for Claude Opus), that provider receives the content of your prompt. Each provider processes data according to its own privacy policy and terms of service. The operator has no control over, and accepts no responsibility for, how these third parties handle data. You should review the privacy policies of any provider whose model you use:
Images and multimodal inputs are not currently supported. While the /v1/chat/completions endpoint accepts the standard OpenAI multimodal message format (a content array with type: "image_url" elements), image data is stripped by the upstream gateway before it reaches any AI provider. Vision-enabled models (GPT-5 series, Claude 4 series, Gemini 2.5 Pro, etc.) will respond as if no image was provided, because the image data is removed during forwarding. The upstream gateway is not designed to handle binary or large payloads, and the operator has no current plans to implement image forwarding. If you send image data in a request, it is discarded by the service and never reaches the AI provider, meaning it is also not logged, stored, or persisted at any layer. This section will be updated if image support is ever added.
When you send a request with parameters such as temperature, top_p, max_tokens, seed, effort, or stop sequences, these parameters are forwarded to the AI provider alongside your prompt. The service does not log, store, or analyze these parameters. They are passed through transparently without modification. Different providers support different parameter ranges and behaviors — unsupported parameters are silently ignored by the provider. The service does not enforce minimum or maximum values for any parameter. The following standard parameters are recognized and forwarded when present: temperature (provider-dependent range, typically 0–2), top_p (0–1), max_tokens / max_output_tokens (positive integer up to model limit), stop (string or array of strings), seed (integer, for deterministic sampling where supported), frequency_penalty (typically -2 to 2), presence_penalty (typically -2 to 2), and effort (model-dependent enum, e.g., "low", "medium", "high"). The service does not expose, log, or retain any record of which parameters were used with which requests.
The following data is never collected, stored, or processed by the operator:
localStorage, never transmitted to or stored by the operator)
Chat history. All chat history is stored in your browser’s localStorage — a client-side key-value store. It never leaves your browser. The operator has no access to it and cannot retrieve it for you if you clear your browser data. Chat history is not synced between devices or browsers. There is no server-side backup, no export feature, and no mechanism for the operator to access your conversations. This is an architectural choice: storing history only on the client eliminates the need for the operator to manage a database, handle data deletion requests, or secure a server-side store of user data. If you want to preserve a specific response, copy it manually before clearing your browser data or switching devices.
Conversation memory. When you send messages in a chat conversation, the upstream gateway maintains a short-term memory of the conversation context to allow the AI model to see the full thread. This memory is stored only in the gateway's RAM (not written to disk, not logged, not persisted). It is associated with a memory token that is unique to that conversation, not to your identity or IP address. If the upstream gateway restarts (due to maintenance, a crash, or deployment), in-memory conversation histories are lost, and new memory tokens will start fresh conversations. This is consistent with the no-persistence design of the entire service.
The operator does not sell, rent, license, or otherwise transfer personal data to any third party. There is no data sharing arrangement with any advertising network, analytics provider, data broker, or marketing firm. The operator has no commercial relationships with any third party that involve the exchange of user data.
The only external parties that process request data are:
Neither of these parties receives data as a result of a “sale” under any applicable privacy law. They process data solely to provide the infrastructure and AI inference services that make this site functional.
The core proxy logic that runs on Cloudflare Workers is partially inspectable from the deployed service. The behavior of the worker is observable through its responses, and the architecture described in Section 2 can be verified through external testing.
The complete source code is not publicly published. This is a deliberate security measure. The codebase contains hardcoded API keys for the upstream gateway and for the Resend email service. Publishing the full source would expose these credentials, which would immediately allow unauthorized third parties to use the operator’s paid infrastructure at the operator’s expense. This is not a withholding of transparency but a protection of the operational security of the service.
Independent security researchers who wish to review the deployed worker's behavior are encouraged to use the feedback mechanism within the application to request a private code review. The operator is willing to share relevant portions of the code with qualified individuals under a non-disclosure agreement.
The upstream gateway is the operator’s private server that sits between the Cloudflare Worker and the AI providers. It is the single component in the architecture whose behavior cannot be independently verified by users. This section describes exactly what the gateway is, how it operates, why it cannot be open-source, and what risks this creates.
Technical architecture. The gateway runs on a rented virtual private server (VPS) in a commercial data center — a traditional long-running server with a persistent operating system, persistent disk storage, and persistent network connectivity. It is not serverless. It is not ephemeral. It is a standard Linux server running a custom application that listens for HTTPS requests. This architectural choice means the gateway has the capability to write data to disk if configured to do so — unlike the Cloudflare Worker, which has no filesystem and physically cannot persist data between requests. The operator states that the gateway is configured not to log, store, or persist prompt content, but this configuration choice cannot be independently verified by anyone outside the operator. The distinction between architectural inability (Worker) and configuration-based restraint (gateway) is the central trust limitation of this service.
Data flow through the gateway. When the gateway receives a request from the Cloudflare Worker, the following occurs in sequence, entirely in server memory (RAM): (1) The HTTP request body — containing the full prompt, model ID, parameters (temperature, max_tokens, effort, tools, system instructions), and memory token — is received into the application’s working memory. (2) The gateway parses the request to extract the model ID. (3) It looks up the model ID in an internal routing table — a configuration mapping each model ID to a specific AI provider endpoint. (4) It selects the appropriate provider API key from an in-memory credential store (loaded from environment variables at startup). (5) It reformats the request body into the provider-specific API schema — the original request body is discarded from the working buffer after reformatting. (6) It opens a new outbound HTTPS connection to the provider’s API endpoint — a fresh TCP/TLS connection that originates from the gateway’s IP address, not yours. (7) It streams the reformatted request to the provider. (8) It streams the response back to the Cloudflare Worker, which streams it to you. The prompt data exists in the gateway’s RAM for the duration of steps 2 through 5 — typically a few milliseconds. After the response is fully transmitted, all memory allocated for that request is garbage-collected. By design, no data is written to disk during this flow.
Why the source code is not public. The gateway’s configuration contains the operator’s paid API keys for all eight AI providers — OpenAI, Anthropic, Google, xAI, DeepSeek, Alibaba, DeepInfra, and Meta. These keys are stored as environment variables or configuration files on the gateway server. Publishing the gateway source code would not directly expose the keys (they are not hardcoded), but it would expose the deployment architecture, environment variable schema, internal routing table format, authentication flow, and the specific provider endpoint mappings — all of which would substantially reduce the effort required for an attacker to discover and exploit the provider keys. The operator has chosen to keep the source private as a security measure to protect the operational infrastructure from targeted attacks. This is consistent with how most production infrastructure is managed — deployment configurations are typically not published even when the application code is open-source.
Comparison to commercial API infrastructure. Every AI API provider also has internal middle layers that users cannot inspect. When you send a request to OpenAI’s API, your request passes through their internal proxy infrastructure — load balancers, authentication gateways, rate limiters, logging systems, and caching layers. The difference is that OpenAI is a publicly known company with contractual obligations, SOC 2 audits, published transparency reports, and legal recourse if data is mishandled. This service has none of these. The technical architecture is similar; the trust model is fundamentally different.
Attack surface analysis. A compromised or malicious gateway could: (1) Log all prompt content to persistent disk storage. (2) Modify prompts before forwarding — injecting, removing, or altering any part of the message. (3) Modify AI provider responses before returning them. (4) Inject executable content into streaming responses — detectable through network inspection. (5) Extract statistical patterns across requests to build behavioral profiles. (6) Use the provider API keys for unauthorized purposes, exhausting the shared quota. Each of these is technically feasible for an operator who controls the middle layer. None can be ruled out by external observation. The only mitigations are the operator’s stated configuration choices, the absence of observed exfiltration, and incentive alignment.
What you can verify independently. You CAN verify: (1) The Cloudflare Worker has no persistent storage — an architectural constraint of the Workers runtime. (2) The Worker connects only to the upstream gateway and AI provider endpoints. (3) The rate limiter is active — testable by exceeding 45 req/min and observing HTTP 429. (4) Text-only requests reach AI providers and return coherent responses. (5) The service is hosted on Cloudflare Pages — verifiable via DNS and HTTP headers. You CANNOT verify: (1) Whether the gateway logs prompts to disk. (2) Whether it modifies prompts or responses. (3) Whether it retains any data after forwarding. (4) Whether the stated configuration matches the deployed configuration. This asymmetry is the central trust limitation.
Recommendation for users. Do not send sensitive, confidential, or personally identifying information through this service. Use it for casual exploration, learning, general knowledge questions, and non-sensitive coding help. If your use case involves proprietary code, legal documents, medical information, or trade secrets, use a commercial AI provider directly. This recommendation is not a judgment about the operator — it is a structural acknowledgment that verifiable trust is not available here, and when trust cannot be verified, the only responsible position is to assume it does not exist for risk assessment purposes.
How models are added. Models are added to the service when the operator has access to a provider API that offers a model worth proxying. Adding a model requires: a working upstream provider endpoint, valid API credentials for that provider, an update to the Worker code to register the model ID and its provider mapping, and an update to the frontend code to define the model’s display name, context window size, supported parameters, and pricing tier. There is no fixed schedule for model additions — models are added on a discretionary basis when the operator has the time and access credentials to do so.
How models are removed. A model is removed when the upstream provider deprecates it, removes it from their API, or changes the terms under which the operator can access it. If a provider revokes API access, all models associated with that provider stop working immediately. Models are also removed if they are functionally identical to another model and the duplicate creates confusion in the model table. Removed models are removed from both the Worker code and the frontend model list — if it is not in the table, it will not work.
Model aliases (routing notes). Some model IDs listed in the service are routing aliases — they accept a request under a well-known model name and forward it to a provider model with a different internal ID. For example, gpt-5 may route to a provider’s internal deployment name for the same model. These aliases are noted in the model table with a “routing alias” indicator. They exist because different providers use different internal identifiers for the same underlying model, and using the well-known name is more intuitive for users. The alias behavior is transparently documented — the exact provider model ID is always available in the documentation.
Model behavior differences. Models from different providers behave differently even when they share a similar name or capability tier. Response speed, refusal patterns, output formatting, supported parameters, context window utilization, and streaming behavior all vary by provider. The operator does not normalize behavior across providers — each model behaves according to its provider’s implementation. If you switch between two models and observe different behavior (e.g., one refuses a prompt that another accepts), that is normal and reflects the different safety and capability profiles of each provider’s implementation, not an error in the service.
Provider API reliability. Different providers have different uptime and reliability profiles. OpenAI and Anthropic generally have high reliability with consistent response times. Google has moderate reliability with occasional latency spikes. Smaller providers like DeepInfra, Novita, or Nebius may have more frequent interruptions, timeouts, or capacity limitations. The service does not implement automatic provider failover — if the model you selected maps to a provider that is temporarily down, the request will fail. You can retry with a different model. The rate limit applies regardless of whether the provider was reachable, to prevent rapid retry loops from consuming the shared provider API quota.
Model cost tiers. Different models cost the operator dramatically different amounts per request. Frontier models (GPT-5-5, Claude Opus 4-7) cost significantly more per token than smaller models (GPT-5 Mini, DeepSeek V3). The operator pays the AI provider per-token usage fees at the provider’s standard commercial rates. The service does not differentiate pricing for end users — all models are available at the same zero price regardless of backend cost. The operator absorbs the cost difference entirely. If a very expensive model becomes sufficiently popular to materially increase monthly costs, the operator may need to remove or restrict that model to keep the service financially sustainable. Model availability is subject to both provider access and cost constraints. There is no per-model budget, cost cap, or usage quota — all models are equally available within the 45 req/min rate limit until cost constraints force a change.
Provider-level data exposure. When a request reaches the AI provider, that provider receives: the full content of your prompt, the model ID you selected, any parameters you specified (system instructions, temperature, effort level, tools, memory token), and metadata that the provider may derive from the HTTP connection (approximate request timing, request size, response size). The provider does not receive your IP address — the upstream gateway terminates the connection from the Worker and initiates a new connection to the provider, so the provider sees the gateway’s IP, not yours. The provider does not receive any personal identifying information because none is sent — there are no accounts, no email addresses, no names, no cookies, and no session identifiers attached to requests. Each provider processes your prompt according to its own privacy policy and terms of service. Links to each provider’s privacy policy are listed in Section 3.3.
This service does not display advertisements, does not embed third-party tracking scripts, and does not use cookies or similar tracking technologies. The only third-party script loaded is:
The in-app feedback form collects: the text message you enter, the selected feedback type (general, bug, idea, custom), the optional custom type string, the current page path, and a honeypot field that is never visible to humans and is used only to detect automated submissions. No personal data is collected or required — no name, no email address, no IP address is attached to the feedback submission. The feedback is sent to the operator via the /_diag/echo endpoint and forwarded to the operator's email using the Resend API. The operator reads every submission but does not store, log, or retain feedback data beyond what is necessary to read and respond to it. The feedback form is the only channel by which the operator receives user communications, and it is intentionally designed to collect the minimum information needed to understand the message.
This service is not directed at children under the age of 13 (or under the age of 16 in the European Economic Area). The operator does not knowingly collect any personal data from children. If you believe a child has used this service, use the feedback mechanism in the application to report it.
This statement may be updated from time to time. The “Last updated” date at the top of this page will reflect the most recent revision. Because the operator has no means of contacting users (no accounts, no email collection), you should review this page periodically if you are concerned about changes.
Because this service has no accounts, no email collection, and the operator is anonymous, there is no direct contact channel. If you have questions, concerns, or complaints about this statement or the service’s data handling, please use the feedback form available within the application interface (click the speech-bubble icon in the sidebar). The operator reads every submission and will respond to legitimate inquiries through the form.
Security researchers who wish to report a vulnerability should consult the security.txt file available at:
https://unlimited-ai-free.pages.dev/.well-known/security.txt
The operator resides in the United States of America. However, because the operator is an individual and not a registered legal entity, there is no formal jurisdiction for legal disputes. This service is provided “as is” without any warranty or guarantee, express or implied. By using this service, you acknowledge that you do so at your own risk and that no legal recourse is available against the operator.
This service is operated by an anonymous individual residing in the United States. There is no registered company, no data protection officer, and no formal complaint handling procedure under the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), or any other data protection or privacy regulation. The privacy statement is provided in good faith as a statement of fact about how the service handles data, but it is not a legally binding contract and does not create enforceable rights under any regulatory framework.
Specifically:
The service does not collect personal data (no names, no email addresses, no accounts, no cookies, no tracking identifiers) and does not sell or share data with third parties — these are factual statements about the service's architecture, not regulatory compliance commitments. If you are subject to GDPR, CCPA, or similar regulations and require a compliant data processing agreement, a designated data protection officer, or formal breach notification procedures, this service cannot meet those requirements and you should not use it for regulated purposes.
This service is funded entirely out of pocket by the operator. There are no advertisements, no data sales, no subscription fees, no donations, and no commercial partnerships. The operator pays for all provider API costs (OpenAI, Anthropic, Google, xAI, DeepSeek, Alibaba, DeepInfra, and Meta) personally.
There is no business model. This is a hobby project maintained because the operator believes AI access should not require multiple $20+/month subscriptions. It is provided as a utility, not a business.
This means the service may not be sustainable indefinitely. If provider API costs exceed what the operator can afford, or if the operator loses interest or encounters legal pressure, the service may be taken down without notice. There is no backup plan, no reserve fund, and no succession plan. Use the service with the understanding that it may not exist tomorrow.
There is no donation mechanism, no Patreon, no GitHub Sponsors, no Buy Me a Coffee, and no cryptocurrency address associated with this service. The operator has intentionally not set up any way to receive money. Accepting donations would create expectations, obligations, and legal complexity — including potential income reporting, tax liability, and classification as a commercial service — that the operator does not want to take on. If the service becomes too expensive to maintain, it will be taken down rather than converted to a paid or donation-supported model. Users who wish to support the service are encouraged to use it, test it, and share honest feedback through the in-app form.
The operator pays for official API access to OpenAI, Anthropic, Google, xAI, DeepSeek, Alibaba, DeepInfra, and Meta. Each provider’s API access is governed by their respective terms of service. Most AI providers' terms prohibit reselling or redistributing API access to third parties. Operating a public proxy that allows anonymous users to access these APIs without their own accounts may technically violate those terms.
This is a known legal gray area. The operator has chosen to remain anonymous in part because of this risk. As a user of this service, your personal legal exposure is minimal — you are not the party entering into or violating the provider agreement. However, it means the service itself operates on uncertain legal footing. If any provider takes issue with how their API is being accessed, they may revoke the operator’s access or pursue legal action, which would result in the service being taken down immediately.
This risk is disclosed so that you can make an informed decision about whether to depend on this service. The operator believes that providing access to AI models should not require multiple $20+/month subscriptions, and operates this service in that spirit. Whether that belief aligns with provider terms of service is a question only the providers themselves can answer.
Chat history is stored in your browser’s localStorage and will not be affected by the server going down. If you rely on this service, consider exporting any data you care about.
The frontend is tested on the latest versions of Chrome, Firefox, Safari, and Edge. It uses standard web technologies (ES modules, CSS custom properties, Fetch API, Web Streams API, localStorage) supported in all modern browsers. Internet Explorer is not supported. The UI is responsive but optimized for desktop-size screens. On very small screens (under 360px width), some UI elements may overlap or clip. Mobile browsers that aggressively throttle background tabs may experience streaming issues if the tab is backgrounded mid-stream. Safari on iOS may have intermittent streaming interruptions due to OS-level tab suspension.
Response speed varies significantly by model and provider. Factors include: provider-side GPU availability, model size (larger models generate each token more slowly), current provider load (popular models may have queuing), geographic distance between the upstream gateway and the provider’s serving region, the length of the generated response, and provider-side rate limiting. GPT-5 Mini and Claude Sonnet are generally fast models. Claude Opus and GPT-5-5 (the largest models) are slower. Small models like DeepSeek and Qwen often respond quickly. Streaming reduces perceived latency because tokens appear incrementally rather than arriving in a single batch at the end. If a model is unusually slow, it is likely due to provider-side congestion rather than an issue with this service — try a different model or retry later.
This service runs on the Cloudflare Workers free (hobby) plan. Key limits that affect the service: each worker invocation has a maximum execution time of 30 seconds (the “CPU time” limit on the free plan). Very long AI responses or very slow model generation may hit this limit before the response completes. When this happens, the streaming recovery mechanism attempts to resume the response by re-sending the original prompt and partial output to the same model (up to 3 retries). If recovery fails, the partial response is preserved with a Continue button. Additionally, the total request body size is limited to approximately 10 MB on the free plan. The free plan also has a limit on the number of worker invocations per day (100,000 requests), which this service has not approached but could theoretically hit with sustained high traffic.
The service does not cache prompts or responses. The Cloudflare Worker runtime does not persist data between requests — it has no database, no file system, no shared memory across invocations, and no cache layer. The upstream gateway is also configured not to cache or log responses. Every request, even an identical one sent milliseconds after another, is processed independently by the AI provider. This means: repeated prompts receive fresh responses each time (no cache hits), there is no performance benefit for repeating a prompt, and identical requests each consume provider API quota and count toward the rate limit. The absence of caching is a privacy feature — your data is never stored, reused, or deduplicated across requests — but it means the service has no way to serve repeated requests faster or at lower cost.
The service is accessible from most countries via Cloudflare’s global edge network. However, the service may not work from countries that block Cloudflare, block pages.dev domains, or block outbound connections to the AI provider endpoints used by the upstream gateway. Additionally, each AI provider enforces its own geographic restrictions based on export control regulations, sanctions lists, and their own terms of service. If a provider refuses a request due to your geographic region, the refusal is passed back to you as an error. The operator cannot bypass provider-side geographic restrictions, and attempting to do so would violate the provider’s terms of service. If the service is inaccessible from your location, it is likely due to infrastructure-level network blocking or provider-level geographic restrictions, both of which are outside the operator’s control. The service does not use VPN or proxy routing to circumvent geographic blocks.
As documented in Section 3.4, image and multimodal inputs are not supported. The upstream gateway strips image data before forwarding requests to AI providers. This is an architectural limitation — the upstream gateway is not designed to handle binary payloads or large data URLs — and the operator has no current plans to add image forwarding. Any vision-enabled model (GPT-5 series, Claude 4 series, Gemini Pro) will respond as if no image was provided. This limitation was verified experimentally: requests containing valid base64-encoded image data in both OpenAI multimodal format (via /v1/chat/completions) and Anthropic image format (via /v1/messages) resulted in responses stating that no image was received. The limitation applies to all models and all endpoints.
This service is provided as-is with no service level agreement, no uptime guarantee, no recovery commitment, and no notice period for changes. Models may be added or removed at any time without announcement. The service may be taken down without notice if the operator encounters financial, legal, or personal constraints. Breaking changes to the API (endpoint URLs, request/response formats, authentication requirements) may occur without a deprecation period or migration path. The operator will attempt to communicate significant changes through the in-app feedback channel and the privacy statement date stamp, but because there are no accounts or email addresses on file, there is no mechanism to proactively notify users of changes. If you depend on this service, you should monitor it regularly and maintain fallback options.
_worker.js file is served as part of the Cloudflare Pages deployment. While the full source is not public (see Section 6), the worker’s runtime behavior can be tested: if prompts were being logged, they would have to be sent somewhere observable. No outbound connections to logging endpoints exist beyond the upstream gateway and the AI providers. This can be verified through network inspection tools.
_worker.js deployment and its behavior is directly observable.
pages.dev domain. Cloudflare’s DPA and privacy policy are linked in Section 3.2 above.
/v1/chat/completions) and Anthropic image format (/v1/messages) to vision-capable models (GPT-5, Claude Sonnet 4-6). In both cases, the model responded indicating that no image was received. This confirms that the upstream gateway removes binary/image payloads before forwarding to AI providers, consistent with the documented limitation in Section 14.6. You can reproduce this test yourself using any HTTP client.
Retry-After header. This confirms the documented rate limit is active and enforced. The limit resets on a sliding window, so brief bursts followed by quiet periods do not trigger the limit.