IBM Open source, pre-trained foundation models make AI and automation easier than ever before

Sometimes the problem with artificial intelligence (AI) and automation is that they are too labor-intensive. That sounds like a joke, but we’re quite serious. Traditional AI tools, especially deep learning-based ones, require huge amounts of effort to use. You need to collect, curate, and annotate data for any specific task you want to perform. This is often such a cumbersome exercise that it takes a significant amount of time to field an AI solution that yields business value. And then you need highly specialized, expensive, and difficult-to-find skills to work the magic of training an AI model. If you want to start a different task or solve a new problem, you often must start the whole process over again; it’s a recurring cost.

But that’s all changing thanks to pre-trained, open source foundation models. A foundation model, often built on a kind of neural network called a “transformer” and trained with a technique called self-supervised learning, is pre-trained on vast amounts of unlabeled data. The model learns the structure of the domain it will work in before you even start thinking about the problem you’re trying to solve. That domain is usually text, but it can also be code, IT events, time series, geospatial data, or even molecules.

Starting from this foundation model, you can solve automation problems with AI using very little data; in some cases, called few-shot learning, just a few examples are enough. In other cases, it’s sufficient to simply describe the task you’re trying to solve.
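Few-shot learning with a foundation model often amounts to nothing more than assembling a handful of worked examples into the prompt itself. The sketch below shows that assembly step; the classification task and example texts are hypothetical, and the resulting prompt could be sent to any text-generation model.

```python
# Sketch of few-shot prompting: the task is taught by example, not by
# retraining the model. Task and examples are illustrative only.

def build_few_shot_prompt(examples, query):
    """Assemble a few labeled examples plus a new query into one prompt."""
    lines = ["Classify the sentiment of each review as Positive or Negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")  # the model completes from here
    return "\n".join(lines)

examples = [
    ("The battery lasts all day.", "Positive"),
    ("It broke after one week.", "Negative"),
]
prompt = build_few_shot_prompt(examples, "Setup took five minutes and just worked.")
```

No gradient updates happen here: the two labeled examples are the entire "training set," which is what makes this approach so much cheaper than classical supervised learning.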

Solving the risks of massive datasets and re-establishing trust for generative AI

Some foundation models for natural language processing (NLP), for instance, are pre-trained on massive amounts of data from the internet. Sometimes you don’t know what data a model was trained on, because the creators of those models won’t tell you. And those massive, large-scale datasets contain some of the darker corners of the internet. It becomes difficult to ensure that a model’s outputs aren’t biased, or even toxic. This is an open, hard problem for the entire field of AI. At IBM, we want to infuse trust into everything we do, and we’re building our own foundation models with transparency at their core for clients to use.

As a first step, we’re carefully curating an enterprise-ready dataset using our data lake tooling to serve as a foundation for our, well, foundation models. We’re removing problematic datasets, and we’re applying AI-based hate and profanity filters to remove objectionable content. That’s an example of negative curation: removing things.
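At its simplest, negative curation is a filter pass over the corpus. The sketch below uses a naive blocklist as a stand-in; a production pipeline like the one described here would use trained hate/profanity classifiers, and the blocklist terms are placeholders.

```python
# Minimal sketch of negative curation: drop documents that trip a filter.
# BLOCKLIST is a hypothetical stand-in for a trained content classifier.

BLOCKLIST = {"badword1", "badword2"}  # placeholder objectionable terms

def passes_filter(doc: str) -> bool:
    """Return True if the document contains no blocklisted token."""
    tokens = {t.strip(".,!?").lower() for t in doc.split()}
    return BLOCKLIST.isdisjoint(tokens)

corpus = ["A clean sentence.", "Contains badword1 here."]
curated = [d for d in corpus if passes_filter(d)]
```

The same shape (a predicate applied document-by-document) scales to classifier-based filters; only `passes_filter` changes.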

We also do positive curation—adding things we know our clients care about. We’ve curated a rich set of data from enterprise-relevant domains: finance, legal and regulatory, cybersecurity, and sustainability. Datasets like this are measured by how many “tokens”—think of them as words or word parts—they include. We’re targeting a 2-trillion-token dataset, which would make it among the largest anyone has assembled.
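To give a sense of the scale of a 2-trillion-token target, here is a back-of-the-envelope calculation. The words-per-token ratio and average book length are rough assumptions, not properties of any particular tokenizer or corpus.

```python
# Back-of-the-envelope scale of a 2-trillion-token dataset.
# Assumptions (not exact figures): ~0.75 English words per token,
# ~90,000 words in an average book.

TARGET_TOKENS = 2_000_000_000_000
WORDS_PER_TOKEN = 0.75        # rough heuristic for English text
AVG_WORDS_PER_BOOK = 90_000   # assumed average book length

approx_words = TARGET_TOKENS * WORDS_PER_TOKEN
approx_books = approx_words / AVG_WORDS_PER_BOOK
print(f"{approx_words:.2e} words, roughly {approx_books:,.0f} books")
```

Under these assumptions the target works out to on the order of tens of millions of books’ worth of text.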

Next, we’re training the models, bringing together best-in-class innovations from the open community and those developed by IBM Research. Over the next few months, we’ll be making these models available to clients, alongside a catalog of open-source models.

Harnessing the power of foundation models at scale

Foundation models represent a paradigm shift in AI, one that requires not only a new technical stack to allow hybrid cloud environments to flourish, but also fundamentally new user interactions that harness the power of these models for the enterprise. Coming soon, our enterprise-ready, next-generation AI studio for AI builders includes two tools for generative AI capabilities powered by foundation models to help bridge this gap for clients: a Prompt Lab and a Tuning Studio.

The Prompt Lab

The Prompt Lab enables users to rapidly explore and build solutions with large language and code models by experimenting with prompts. Prompts are simple text inputs that effectively nudge the model to do your bidding with direct instructions. Prompts can also include a few examples to guide the model towards the exact behavior you’re looking for.

With language models, all you have to do is write the instructions in natural language. It usually takes a certain amount of trial and error to craft a prompt that enables the model to generate the desired result—a craft now known as prompt engineering. For instance, within the Prompt Lab, users can leverage different prompts, both zero-shot and few-shot, to accomplish tasks such as:

  • Generate text for marketing campaign: Create high-quality content for marketing campaigns given target audiences, campaign parameters, and other keywords.
  • Extract facts from SEC 10-K filings: Extract details, like maximum borrowing capacity, from dense financial filings.
  • Summarize meeting transcripts: Summarize a transcript from a meeting, understanding key takeaways without having to read through the entire conversation.
  • Answer questions about an article or dynamic content: Build a question-answering interface grounded in specific content, and recommend optimal next steps to provide customer service assistance.
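The last use case above—answering questions grounded in specific content—can be sketched as a prompt template that pastes the source text into the prompt so the model answers from that content rather than from memory. The function and variable names are illustrative, not a specific product API.

```python
# Sketch of a grounded question-answering prompt: the article is supplied
# in the prompt itself, constraining the model to answer from it.
# Names and wording are illustrative only.

def grounded_qa_prompt(article: str, question: str) -> str:
    return (
        "Answer the question using only the article below. "
        "If the answer is not in the article, say so.\n\n"
        f"Article:\n{article}\n\n"
        f"Question: {question}\nAnswer:"
    )

p = grounded_qa_prompt(
    "Acme's Q3 revenue rose 12% year over year.",
    "How did revenue change in Q3?",
)
```

Grounding the prompt this way is what lets the interface answer about dynamic content the model never saw during training.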

With the Prompt Lab, practically anyone can harness the power of foundation models for enterprise use cases. Engineers and developers can also use our APIs to embed these capabilities into external and internal applications. We’re actively working on an enhanced developer experience that offers useful libraries and code samples.
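Embedding these capabilities into an application typically means constructing a generation request programmatically. The sketch below builds such a request payload; the endpoint shape, field names, and parameters here are invented for illustration—consult the actual SDK or API reference for the real schema.

```python
# Hypothetical sketch of building a text-generation request payload.
# Every field name below is an assumption, not a documented API.

import json

def build_generation_request(model_id: str, prompt: str, max_new_tokens: int = 200) -> dict:
    """Assemble a request body a generation API might accept (illustrative)."""
    return {
        "model_id": model_id,                      # hypothetical field
        "input": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

payload = json.dumps(
    build_generation_request("my-model", "Summarize the meeting notes.")
)
```

An application would POST a payload like this to the inference endpoint and read the generated text from the response.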

The Tuning Studio

With the Tuning Studio, users can further customize foundation model behavior using a state-of-the-art method that requires as few as 100 to 1,000 examples. By using advanced prompt tuning, you can efficiently create and deploy a foundation model that is customized to your data.

Tuning can be useful to adapt existing models to domain-specific tasks (i.e., learn new tasks). It also allows enterprises to harness their proprietary data to differentiate their applications.

In the Tuning Studio, all you have to do is specify your task and provide labeled examples in the required format. Once model training is complete, you can deploy the model and use it both in the Prompt Lab and via an API.
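The labeled examples tuning needs are just input/output pairs. A common way to lay them out is JSON Lines, one example per line, as sketched below; the exact schema any given tuning tool requires will differ, so treat these field names as illustrative.

```python
# Sketch of labeled examples for tuning, laid out as JSON Lines.
# The "input"/"output" field names are an assumed schema, not a
# documented format.

import json

examples = [
    {"input": "Invoice INV-204 is 30 days overdue.", "output": "collections"},
    {"input": "Please reset my account password.",   "output": "support"},
]

# Serialize: one JSON object per line.
jsonl = "\n".join(json.dumps(e) for e in examples)

# Round-trip check: each line parses back to the original example.
parsed = [json.loads(line) for line in jsonl.splitlines()]
```

A tuning run over a few hundred lines like these is what the "100 to 1,000 examples" figure above refers to—orders of magnitude less data than training a model from scratch.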

What are we doing ahead of the release?

As we gear up for our broader release in July, we’re actively seeing new use cases built out through our Tech Preview program. We are investing in a roadmap of state-of-the-art tooling to efficiently customize models with proprietary data. We’re also improving the Prompt Lab with interfaces that help novice users construct better prompts and guide the models to the right answers more quickly.

In addition, we recently open-sourced a preview of our Python SDK and announced a partnership with Hugging Face to integrate their open-source libraries into our platform. These foundation model capabilities fit into a greater data and AI platform, watsonx, alongside two other key pillars: watsonx.data and watsonx.governance. Together, watsonx offers organizations the ability to:

  • Train, tune, and deploy AI across your business with watsonx.ai
  • Scale AI workloads, for all your data, anywhere, with watsonx.data
  • Enable responsible, transparent, and explainable data and AI workflows with watsonx.governance
