Perplexity AI Introduces Hybrid Local-Server Inference Orchestrator for Personal Computer: Automatic On-Device and Cloud Task Routing

MarkTechPost | stanford agents memory tools

Meet OpenJarvis: A Local-First Framework for On-Device Personal AI Agents with Tools, Memory, and Learning

Stanford researchers released OpenJarvis, an open-source framework that runs inference, agents, memory, and learning entirely on-device. It decomposes a personal AI system into five composable primitives (Intelligence, Engine, Agents, Tools & Memory, and Learning) and lands within 3.2 points of the best cloud model at roughly 800× lower marginal API cost.

Jun 4, 6:23 AM

Crypto Briefing | nvidia ai chip personal computer 120 billion parameter models

Nvidia enters personal computer market with new AI chip that can run 120 billion parameter models locally

Nvidia's entry into the PC market with a powerful AI chip could redefine local AI processing, challenging existing tech giants and reshaping user data privacy.

Jun 1, 3:13 PM

MarkTechPost | hugging face perplexity ai unigram tokenizer tokenizers crate

Perplexity AI Open-Sources Unigram Tokenizer That Achieves 5x Lower p50 Latency Than Hugging Face tokenizers Crate

Perplexity AI open-sources a rewritten Unigram tokenizer that reduces reranker latency and cuts production CPU utilization by 5-6x.

May 28, 9:08 AM

Crypto Briefing | ai infrastructure gpu-as-a-service cloud models roundhill

Roundhill files for Neocloud ETF targeting GPU-as-a-Service infrastructure

The rise of Neocloud ETFs could significantly reshape AI infrastructure investment, emphasizing specialized GPU services over traditional cloud models.

May 23, 2:38 AM

AI Insider | ai agent mac local files perplexity

Perplexity Launches Personal Computer AI Agent for All Mac Users to Rival Local AI Assistants

Perplexity has opened its Personal Computer feature to all Mac users through a new desktop app, bringing local AI agent capabilities beyond its previous Max subscriber waitlist. The tool extends Perplexity’s cloud-based Computer product onto users’ own devices, giving AI agents access to local files, native Mac applications, over 400 connectors, and the web to […]

May 9, 7:55 AM

AI Insider | conversational ai snap snapchat perplexity ai

Snap Ends $400M Perplexity AI Deal as CEO Spiegel Pivots Focus to Intelligent Eyewear

Snap has quietly terminated its $400 million partnership with AI search startup Perplexity, revealing the split as part of its first-quarter earnings report. The deal, announced last November, would have embedded Perplexity’s conversational AI search engine directly into Snapchat’s Chat interface. Despite limited testing with select users, the companies failed to agree on a path […]

May 8, 12:38 PM

TechCrunch AI | ai agents mac perplexity personal computer

Perplexity’s Personal Computer is now available to everyone on Mac

Perplexity's Personal Computer brings AI agents to your Mac, and is now open to everyone.

May 7, 7:57 PM

InfoWorld AI | small language models slms privacy ai architecture

Small language models: Rethinking enterprise AI architecture

Three key advantages of SLMs:
Division of labor: Modern AI architecture uses routers to send routine tasks to 7B-parameter SLMs, reserving trillion-parameter LLMs only for complex reasoning.
Economic efficiency: For high-volume, repetitive tasks, SLMs can reduce cloud inference costs by up to 90% while providing near-instant latency.
Privacy at the edge: Because SLMs can run locally on-device or on-premises, they reduce the data leakage risks inherent in sending sensitive telemetry to the public cloud.

Large language models (LLMs) are the workhorses of AI, supporting ever more sophisticated capabilities and workflows, and approaching near-human level performance. But sometimes more isn’t always better; it’s just more. Specialized data and limited capabilities are just fine for some workflows. This realization is driving the evolution of small language models (SLMs), rather than one-size-fits-all LLMs. SLMs, coming in the form of domain-specific models, statistical langua

May 4, 9:00 AM
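
The router-based division of labor described in the InfoWorld excerpt above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the model labels `slm-7b` and `llm-frontier`, the keyword list, the 60-word threshold, and the `route` function itself are hypothetical and do not come from the article or from any vendor's product. Production routers typically rely on a trained classifier or on confidence scores rather than keyword matching.

```python
import re
from dataclasses import dataclass

# Hypothetical model labels; the excerpt names no specific models.
SMALL_MODEL = "slm-7b"
LARGE_MODEL = "llm-frontier"

# Assumed keyword cues suggesting a request needs multi-step reasoning.
REASONING_CUES = {"prove", "derive", "plan", "compare", "why", "tradeoffs"}


@dataclass
class Route:
    model: str   # which tier should handle the request
    reason: str  # short explanation of the routing decision


def route(prompt: str, max_routine_words: int = 60) -> Route:
    """Pick a model tier from two crude signals: prompt length and keyword cues."""
    words = re.findall(r"[a-z']+", prompt.lower())
    if len(words) > max_routine_words:
        return Route(LARGE_MODEL, "long prompt")
    hits = sorted(REASONING_CUES.intersection(words))
    if hits:
        return Route(LARGE_MODEL, "reasoning cue: " + hits[0])
    return Route(SMALL_MODEL, "routine task")


if __name__ == "__main__":
    for prompt in (
        "Summarize this ticket in one line",
        "Why does this fail? Compare both designs.",
    ):
        decision = route(prompt)
        print(f"{decision.model:13s} {decision.reason:24s} {prompt}")
```

The sketch keeps the routing decision separate from any model call, so the same function could sit in front of an on-device runtime and a cloud API without change.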