AI This Week: From Alibaba's Innovations To Microsoft's New Android App

This week’s AI news takes us through a landscape where technology meets real-world application. We’re looking at innovations that span from enhancing creative expression in video and image generation to tackling complex challenges in healthcare and finance.

As we examine these stories, we’ll see how AI isn’t just a tool for tech giants, but a versatile asset across various sectors. So, let’s dive into an insightful exploration of AI’s current state and its tangible impact on the world around us.

AI In Consumer And Business Applications

Apple’s Ferret with Cornell University

Apple, in collaboration with Cornell University, has made a significant stride in AI research with Ferret, an open-source multimodal large language model. Released on GitHub, Ferret stands out for its ability to use regions of an image as queries, identifying the elements within those regions to help answer complex queries.

This marks a deviation from Apple’s typically guarded approach, potentially influencing future products. Ferret’s training on Nvidia’s A100 GPUs underlines Apple’s growing commitment to AI advancements, reflecting a new era of openness and contribution to the AI community.

Canva’s Generative AI Integration

Australian graphic design giant Canva has seamlessly integrated generative AI into its platform, significantly enhancing user experience and content creation capabilities. Despite market fluctuations, Canva has maintained robust growth, with an impressive addition of 80 million users over the past year.

This surge is largely attributed to Canva’s focus on local, authentic content and global expansion efforts. The implementation of generative AI not only fosters user growth but also boosts revenue, offering users a diverse array of smart content options.

Canva’s commitment to AI is further cemented by a substantial $200 million investment over three years, aimed at training models with creator content. Creators have reportedly shown minimal resistance to the plan.

Apple’s Licensing Talks with News Publishers

Apple is reportedly engaging in negotiations with prominent news publishers to license their content for its generative AI initiatives. According to The New York Times, Apple proposed multi-year agreements, potentially worth over $50 million, to access extensive archives of news articles.

Targeted publishers include heavyweights like Condé Nast, NBC News, and IAC. These talks mark a significant shift in Apple’s strategy, pivoting from its traditional focus on device functionality enhancement to a deeper dive into generative AI technology.

While the proposal has elicited mixed reactions from the publishers, it signifies Apple’s ambition to strengthen its foothold in the rapidly evolving domain of generative AI.

Leonardo AI’s Image-to-Motion Tool

Leonardo AI has introduced a groundbreaking feature that transforms static images into animated motion videos, a notable advancement in the realm of visual content creation.

Especially beneficial for creative professionals, the tool offers a variety of dynamic effects to elevate digital imagery. It is designed to enhance visual storytelling and opens new avenues for users in marketing and design to captivate their audiences.

While the transformation costs 25 credits per video, users on the free plan receive daily allowances for creating multiple motion videos.

LG’s Roaming AI Assistant

LG has unveiled a revolutionary roaming AI assistant, a wheeled robot designed to redefine smart home devices. Announced on December 27, 2023, and set to debut at CES in Las Vegas, this self-balancing, two-wheeled robot is equipped with advanced optical and depth-sensing cameras.

It possesses the capability to recognize faces, pets, and objects while monitoring environmental factors like temperature and air quality. The robot offers a range of functionalities, from home patrolling to pet monitoring, security breach detection, and managing household appliances.

It can discern users’ emotions when greeting them and offer suitable content, combining traditional voice interaction with autonomous mobility and environmental monitoring. LG’s AI assistant represents a significant leap in home automation technology, blending convenience with innovation.

Resemble AI’s Audio Enhancement Tool

Resemble AI has unveiled Resemble Enhance, an open-source AI tool revolutionizing audio clarity. This innovative tool excels in transforming noisy recordings into crystal-clear speech, making it ideal for podcasting, entertainment, and historical audio restoration.

It integrates a denoiser and an enhancer to isolate voices from background noise, fix distortions, and extend speech bandwidth. Focused on continual improvement, Resemble AI aims to refine processing speed and control over speech nuances. Notably, the tool can enhance recordings that are more than 75 years old, demonstrating its versatility.

Microsoft’s Dedicated Copilot App for Android

Microsoft has launched a dedicated Copilot app for Android, marking a significant expansion in AI-powered mobile applications. This app, distinct from the Bing mobile app, provides access to Microsoft’s AI Copilot, including chatbot capabilities and DALL-E 3 image generation. It also offers free access to OpenAI’s GPT-4 model.

Originally integrated into Bing as a ChatGPT-like interface, the Copilot now stands as a standalone experience. This launch represents Microsoft’s continued push into AI-driven applications, enhancing user accessibility and interaction with cutting-edge AI technology.

Pika Labs’ AI Video Generator

Pika Labs has released an advanced AI video generator, now available to all users through a versatile web interface. This tool allows for dynamic video creation using images, text, or a combination, featuring options for camera movements and video-text consistency.

Initially designed for Discord, the tool now has a web version that makes it easy to create videos of up to 15 seconds. Amidst stiff competition in the prompt-to-video market, Pika Labs distinguishes itself with this user-friendly, innovative tool.

Advancements In AI Models And Technologies

Nvidia’s “Align Your Gaussians” for 3D Animations

Nvidia, alongside the University of Toronto and MIT, has developed “Align Your Gaussians” (AYG), a groundbreaking AI system for creating 3D animations from text descriptions. AYG uses 3D Gaussian functions for object shaping and animation, blending various AI models for realism and motion smoothness.

Its ability to generate lifelike motions and textures from simple text prompts, such as a galloping horse, opens new possibilities in creative tools and synthetic data generation. This advancement in 3D animation technology could revolutionize industries like gaming and autonomous vehicle training.

University of Minnesota’s Breast Cancer Treatment AI

Researchers at the University of Minnesota are pioneering AI in breast cancer treatment, focusing on minimizing chemotherapy-induced heart damage. With a $1.2 million grant, the team, led by Rui Zhang and Assistant Professor Ju Sun, is developing an AI tool to predict cardiac risks.

This project aims to tailor treatment plans and improve patient outcomes. The challenge lies in adapting AI to diverse and limited patient data. Success in this endeavor could extend the application of AI in healthcare, potentially transforming treatment approaches for various cancers and diseases.

Entrupy’s AI Luxury Item Authenticator

Entrupy has developed an AI tool capable of authenticating luxury items, such as handbags and sneakers, with a 99.1% accuracy rate. Popular among vintage resellers, this AI authenticator verifies products from high-end brands.

Using a specialized device, users take detailed photos, and the tool generates an official certificate for authenticated items. Currently, it is limited to major brands but is gaining traction, especially with a partnership with TikTok to identify counterfeit products. Entrupy’s innovation represents a significant leap in the fight against counterfeit luxury goods.

Alibaba’s Make-A-Character and AI Tools

Alibaba has unveiled Make-A-Character (Mach), an advanced text-to-3D model tool that transforms text descriptions into detailed 3D avatars. Mach currently focuses on avatars of Asian ethnicity and aims to add greater diversity.

It uses large language models and vision foundation models for text-to-visual mapping. Because the avatars have a parameterized representation, they are easy to animate. Alongside Mach, Alibaba introduced RichDreamer for 2D to 3D conversion and enhanced language models like Qwen-72B.

These tools signify Alibaba’s commitment to advancing AI research and applications, particularly in the realm of 3D modeling and animation.

Midjourney’s Training Video Models

Midjourney, led by CEO David Holz, is set to expand its AI capabilities by training video models, building on its advanced image model foundation. This development parallels Meta’s success with video AI and signals Midjourney’s entry into the dynamic field of AI-generated video content.

The upcoming v6 updates aim to enhance text rendering and prompt responsiveness. Midjourney’s move into video model training reflects the growing trend in AI towards creating more complex and interactive media, showcasing the potential for further advancements in AI-driven video technology.

Google’s VideoPoet Large Language Model

Google has introduced VideoPoet, a cutting-edge Large Language Model (LLM) specialized in various video generation tasks. This model excels in text-to-video conversion, video editing, and audio synchronization, addressing challenges in generating seamless large-scale motions.

VideoPoet’s integrated approach, training with multiple tokenizers, sets it apart in the AI-generated video technology landscape. It offers enhanced text fidelity and dynamic motion in videos, distinguishing itself from contemporaries like Imagen Video and Stable Video Diffusion.

Update to Google’s AI Image Generator, Imagen

Google has updated its AI image generator, Imagen, introducing Imagen 2.0, now accessible to Google Cloud customers. Imagen is a text-to-image AI model that creates photorealistic images from textual descriptions.

The update brings improvements like enhanced image-caption understanding and advanced image-generation techniques. Imagen 2.0’s improved training dataset enables more accurate image generation, reflecting a deeper understanding of the relationship between images and words.

These enhancements in Imagen 2.0 mark a significant advancement in AI image generation, offering more realistic and contextually accurate visual outputs based on textual prompts.

AI In Legal, Ethical, And Regulatory Contexts

China Approves Generative AI Models

China has made a significant move in AI regulation by officially approving four large generative AI models. This initiative, led by tech giants like Baidu and Tencent, sets a global precedent for the regulation of AI-generated content.

The approval process, conducted by the Ministry of Industry and Information Technology, focuses on enhancing intelligence and security across various domains. This development not only accelerates AI advancements in China but also influences global trends in AI governance, setting standards for responsible and secure AI deployment.

AI in Australian Court Bail Decisions

Australian courts are contemplating the integration of AI to enhance efficiency and reduce unconscious bias in bail decisions. With support from the Australasian Institute of Judicial Administration, this move towards technological adoption is seen as a step towards modernizing the judicial system.

However, judges in New South Wales call for thorough vetting of AI tools to ensure fairness and reliability. This exploration into AI’s potential in legal settings underscores the balance needed between technological innovation and maintaining the integrity of judicial processes.

AI Foundation Model Transparency Act

In the US, a new bill titled the AI Foundation Model Transparency Act has been proposed to regulate the use of copyrighted data by AI companies. The bill, involving agencies like the FTC and NIST, would require companies to disclose sources of training data and address model limitations.

This initiative aims to bring transparency to AI model training, especially concerning copyright issues. The bill reflects growing concerns over the ethical use of data in AI and the need for clear standards in the rapidly evolving AI industry.

The New York Times Lawsuit Against OpenAI and Microsoft

The New York Times has filed a lawsuit against OpenAI and Microsoft, claiming copyright infringement in the training of ChatGPT. The newspaper alleges that millions of its articles were used without permission, affecting its role as a news source.

Supported by the News/Media Alliance, this lawsuit highlights the broader issue of copyright in AI training. The case underscores the pressing need for clear legal frameworks around AI development, balancing innovation with respect for intellectual property and the role of journalism in responsible AI advancement.

AI-Assisted Antibiotic Discovery by MIT and Harvard

Scientists from MIT and Harvard have achieved a breakthrough in healthcare using AI, discovering a new class of antibiotics. This discovery, aimed at combating drug-resistant bacteria, involved AI screening millions of chemical compounds.

The effectiveness of these compounds was further validated in experiments with mice. This innovation addresses the global health threat posed by antibiotic resistance and could significantly impact healthcare costs and outcomes.

The work represents a critical advancement in medical research, utilizing AI to uncover new treatments and potentially save millions of lives annually.

Innovations In AI Research And Development

life2vec by DTU

Researchers from DTU, the University of Copenhagen, ITU, and Northeastern University have developed life2vec, an AI model that predicts significant life events.

Analyzing data from 6 million Danes, this transformer-based model demonstrates high accuracy in forecasting outcomes like personality traits and time of death. life2vec raises important ethical discussions around data privacy and the implications of such predictive technology, calling for a democratic conversation on its use.

Anthropic’s Revenue Growth

Anthropic, an AI firm and OpenAI competitor, forecasts an impressive $850 million in annualized revenue by the end of 2024.

This substantial growth is attributed to its language model Claude and a shift in OpenAI’s client base. Anthropic’s success, backed by significant funding, positions it as a key player in the generative AI market.

BrainGPT’s Thought-to-Text Translation

Australian researchers have developed DeWave, an AI system also known as BrainGPT, which translates thoughts into text from EEG-recorded brainwaves.

This non-invasive method, achieving over 40% accuracy, holds promise for aiding stroke victims and controlling bionic devices. DeWave’s unique approach to encoding EEG into language represents a significant advancement in neuroscience and AI.

Finance Challenges for Large Language Models

A study by Patronus AI highlights challenges faced by large language models like OpenAI’s GPT-4 Turbo in the finance sector, particularly with SEC filings.

Despite comprehensive prompts, GPT-4 Turbo showed only 79% accuracy, underscoring the need for human oversight in financial applications of AI.

Copyright Issues in AI Bedtime Stories

The use of AI to create personalized bedtime stories featuring characters like Bluey has sparked copyright concerns.

Ludo Studio, creator of Bluey, claims infringement, raising questions about the ethical and legal aspects of digital storytelling and the preservation of human creativity in AI-generated content.

Sam Altman and Jony Ive’s AI Device

Sam Altman and Jony Ive have collaborated on a new AI hardware project, recruiting iPhone design chief Tang Tan from Apple.

Their venture, expected to disrupt tech hardware, reflects the growing influence of generative AI in user-device interaction. This project highlights the evolving landscape of AI-driven technology and its potential impact on everyday devices.

Conclusion

As we conclude this week’s AI journey, the stories we’ve encountered signify not just technological advancements but a shift in the AI paradigm.

From Apple’s venture into open-source AI to groundbreaking developments in healthcare, AI’s impact is increasingly tangible across diverse sectors. Legal and ethical considerations continue to shape AI’s trajectory, highlighting the need for responsible innovation.

The blend of human creativity with AI’s potential, whether in storytelling or healthcare, underscores a future where AI complements human skills, enriching our experiences and solutions.

These developments reflect a transformative phase in AI, where its integration into our daily lives and industries isn’t just imminent—it’s already underway, shaping a future where AI’s role is as fundamental as it is revolutionary.
