Programming

Mistral Releases Codestral, Its First Generative AI Model For Code (techcrunch.com) 27

Mistral, the French AI startup backed by Microsoft and valued at $6 billion, has released its first generative AI model for coding, dubbed Codestral. From a report: Codestral, like other code-generating models, is designed to help developers write and interact with code. It was trained on over 80 programming languages, including Python, Java, C++ and JavaScript, explains Mistral in a blog post. Codestral can complete coding functions, write tests and "fill in" partial code, as well as answer questions about a codebase in English. Mistral describes the model as "open," but that's up for debate. The startup's license prohibits the use of Codestral and its outputs for any commercial activities. There's a carve-out for "development," but even that has caveats: the license goes on to explicitly ban "any internal usage by employees in the context of the company's business activities." The reason could be that Codestral was trained partly on copyrighted content. Codestral might not be worth the trouble, in any case. At 22 billion parameters, the model requires a beefy PC in order to run.
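For the curious, here's a rough sketch of what calling a fill-in-the-middle (FIM) endpoint like Codestral's could look like. The endpoint URL, model identifier, JSON fields, and response shape below are assumptions modeled on Mistral's published API conventions rather than details from the article, so check the official documentation before relying on any of it:

```rust
// Sketch: calling a Codestral-style fill-in-the-middle (FIM) endpoint.
// Assumptions: endpoint URL, model name, and JSON fields follow Mistral's
// public API conventions; verify against the official docs.
// Cargo.toml: reqwest = { version = "0.12", features = ["blocking", "json"] }
//             serde_json = "1"

use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("MISTRAL_API_KEY")?;

    // The model sees the code before and after a gap and fills in the middle.
    let body = json!({
        "model": "codestral-latest",          // assumed model identifier
        "prompt": "def fibonacci(n):\n",      // code before the gap
        "suffix": "\nprint(fibonacci(10))",   // code after the gap
        "max_tokens": 128,
    });

    let resp: serde_json::Value = reqwest::blocking::Client::new()
        .post("https://api.mistral.ai/v1/fim/completions") // assumed endpoint
        .bearer_auth(api_key)
        .json(&body)
        .send()?
        .error_for_status()?
        .json()?;

    // Response shape is assumed to mirror Mistral's chat-completions format.
    println!("{}", resp["choices"][0]["message"]["content"]);
    Ok(())
}
```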
AI

Mojo, Bend, and the Rise of AI-First Programming Languages (venturebeat.com) 26

"While general-purpose languages like Python, C++, and Java remain popular in AI development," writes VentureBeat, "the resurgence of AI-first languages signifies a recognition that AI's unique demands require specialized languages tailored to the domain's specific needs... designed from the ground up to address the specific needs of AI development." Bend, created by Higher Order Company, aims to provide a flexible and intuitive programming model for AI, with features like automatic differentiation and seamless integration with popular AI frameworks. Mojo, developed by Modular AI, focuses on high performance, scalability, and ease of use for building and deploying AI applications. Swift for TensorFlow, an extension of the Swift programming language, combines the high-level syntax and ease of use of Swift with the power of TensorFlow's machine learning capabilities...

At the heart of Mojo's design is its focus on seamless integration with AI hardware, such as GPUs running CUDA and other accelerators. Mojo enables developers to harness the full potential of specialized AI hardware without getting bogged down in low-level details. One of Mojo's key advantages is its interoperability with the existing Python ecosystem. Unlike languages like Rust, Zig or Nim, which can have steep learning curves, Mojo allows developers to write code that seamlessly integrates with Python libraries and frameworks. Developers can continue to use their favorite Python tools and packages while benefiting from Mojo's performance enhancements... It supports static typing, which can help catch errors early in development and enable more efficient compilation... Mojo also incorporates an ownership system and borrow checker similar to Rust, ensuring memory safety and preventing common programming errors. Additionally, Mojo offers memory management with pointers, giving developers fine-grained control over memory allocation and deallocation...
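Because Mojo's own syntax is still in flux, the ownership-and-borrowing discipline the article attributes to it is easiest to show in Rust, the language it is being compared to. A minimal sketch of that model (this is Rust, not Mojo):

```rust
// The ownership/borrowing model the article says Mojo adopts, shown in Rust.
// A value has exactly one owner; borrows are checked at compile time.

fn main() {
    let buffer = vec![1.0f32, 2.0, 3.0];

    let total = sum(&buffer);       // immutable borrow: caller keeps ownership
    println!("sum = {total}");

    consume(buffer);                // ownership moves into `consume`
    // println!("{:?}", buffer);    // compile error: value was moved above
}

fn sum(data: &[f32]) -> f32 {
    data.iter().sum()
}

fn consume(data: Vec<f32>) {
    // `data` is dropped (freed) here, deterministically, with no GC.
    println!("consumed {} elements", data.len());
}
```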

Mojo is conceptually lower-level than some other emerging AI languages like Bend, which compiles modern high-level language features to native multithreading on Apple Silicon or NVIDIA GPUs. Mojo offers fine-grained control over parallelism, making it particularly well-suited for hand-coding modern neural network accelerations. By providing developers with direct control over the mapping of computations onto the hardware, Mojo enables the creation of highly optimized AI implementations.

According to Mojo's creator, Modular, the language has already garnered an impressive user base of over 175,000 developers and 50,000 organizations since it was made generally available last August. Despite its impressive performance and potential, Mojo's adoption might have stalled initially due to its proprietary status. However, Modular recently decided to open-source Mojo's core components under a customized version of the Apache 2 license. This move will likely accelerate Mojo's adoption and foster a more vibrant ecosystem of collaboration and innovation, similar to how open source has been a key factor in the success of languages like Python.

Developers can now explore Mojo's inner workings, contribute to its development, and learn from its implementation. This collaborative approach will likely lead to faster bug fixes, performance improvements and the addition of new features, ultimately making Mojo more versatile and powerful.

The article also notes other languages "trying to become the go-to choice for AI development" by providing high-performance execution on parallel hardware. Unlike low-level beasts like CUDA and Metal, Bend feels more like Python and Haskell, offering fast object allocations, higher-order functions with full closure support, unrestricted recursion and even continuations. It runs on massively parallel hardware like GPUs, delivering near-linear speedup based on core count with zero explicit parallel annotations — no thread spawning, no locks, mutexes or atomics. Powered by the HVM2 runtime, Bend exploits parallelism wherever it can, making it the Swiss Army knife for AI — a tool for every occasion...

The resurgence of AI-focused programming languages like Mojo, Bend, Swift for TensorFlow, JAX and others marks the beginning of a new era in AI development. As the demand for more efficient, expressive, and hardware-optimized tools grows, we expect to see a proliferation of languages and frameworks that cater specifically to the unique needs of AI. These languages will leverage modern programming paradigms, strong type systems, and deep integration with specialized hardware to enable developers to build more sophisticated AI applications with unprecedented performance. The rise of AI-focused languages will likely spur a new wave of innovation in the interplay between AI, language design and hardware development. As language designers work closely with AI researchers and hardware vendors to optimize performance and expressiveness, we will likely see the emergence of novel architectures and accelerators designed with these languages and AI workloads in mind. This close relationship between AI, language, and hardware will be crucial in unlocking the full potential of artificial intelligence, enabling breakthroughs in fields like autonomous systems, natural language processing, computer vision, and more.

The future of AI development and computing itself are being reshaped by the languages and tools we create today.

In 2017 Modular AI's founder Chris Lattner (creator of Swift and LLVM) answered questions from Slashdot readers.
Programming

Rust Foundation Reports 20% of Rust Crates Use 'Unsafe' Keyword (rust-lang.org) 92

A Rust Foundation blog post begins by reminding readers that Rust programs "are unable to compile if memory management rules are violated, essentially eliminating the possibility of a memory issue at runtime."

But then it goes on to explore "Unsafe Rust in the wild" (used for a small set of actions like dereferencing a raw pointer, modifying a mutable static variable, or calling unsafe functions). "At a superficial glance, it might appear that Unsafe Rust undercuts the memory-safety benefits Rust is becoming increasingly celebrated for. In reality, the unsafe keyword comes with special safeguards and can be a powerful way to work with fewer restrictions when a function requires flexibility, so long as standard precautions are used."
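A minimal Rust sketch of those three operations shows how tightly the escape hatch is scoped; everything outside the unsafe block remains fully checked by the compiler:

```rust
// The three unsafe operations named above, in one compilable sketch.

static mut COUNTER: u32 = 0; // mutable static: reads/writes require unsafe

unsafe fn bump() { // an unsafe fn: callers must uphold its contract
    COUNTER += 1;
}

fn main() {
    let x = 42u32;
    let p = &x as *const u32; // creating a raw pointer is safe...

    unsafe {
        println!("deref raw pointer: {}", *p); // ...dereferencing it is not
        bump();                                // calling an unsafe function
        let c = COUNTER;                       // reading a mutable static
        println!("mutable static: {c}");
    }
}
```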

The Foundation lists those available safeguards — which "make exploits rare — but not impossible." But then they go on to analyze just how much Rust code actually uses the unsafe keyword: The canonical way to distribute Rust code is through a package called a crate. As of May 2024, there are about 145,000 crates, of which approximately 127,000 contain significant code. Of those 127,000 crates, 24,362 make use of the unsafe keyword, which is 19.11% of all crates. And 34.35% make a direct function call into another crate that uses the unsafe keyword [according to numbers derived from the Rust Foundation project Painter]. Nearly 20% of all crates have at least one instance of the unsafe keyword, a non-trivial number.

Most of these Unsafe Rust uses are calls into existing third-party non-Rust language code or libraries, such as C or C++. In fact, the crate with the most uses of the unsafe keyword is the Windows crate, which allows Rust developers to call into various Windows APIs. This does not mean that the code in these Unsafe Rust blocks is inherently exploitable (a majority or all of that code is most likely not), but that special care must be taken while using Unsafe Rust in order to avoid potential vulnerabilities...
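The typical pattern behind those numbers is unsafe acting as a foreign-function boundary. A minimal sketch calling the C standard library's abs illustrates why the keyword is required: the compiler cannot verify what happens on the C side, so the caller takes responsibility:

```rust
// Typical pattern behind those numbers: unsafe as an FFI boundary.
// `abs` lives in the C standard library, so Rust cannot check it.

extern "C" {
    fn abs(input: i32) -> i32; // declaration of a C function
}

fn main() {
    // The call is unsafe because the compiler must trust the C side
    // to match the declared signature and behave correctly.
    let magnitude = unsafe { abs(-7) };
    println!("abs(-7) = {magnitude}");
}
```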

Rust lives up to its reputation as an excellent and transformative tool for safe and secure programming, even in an Unsafe context. But this reputation requires resources, collaboration, and constant examination to uphold properly. For example, the Rust Project is continuing to develop tools like Miri to allow the checking of unsafe Rust code. The Rust Foundation is committed to this work through its Security Initiative: a program to support and advance the state of security within the Rust Programming language ecosystem and community. Under the Security Initiative, the Rust Foundation's Technology team has developed new tools like [dependency-graphing] Painter, TypoMania [which checks package registries for typo-squatting] and Sandpit [an internal tool watching for malicious crates]... giving users insight into vulnerabilities before they can happen and allowing for a quick response if an exploitation occurs.
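To make Miri's role concrete, here is a sketch of a use-after-free that compiles cleanly but that a run under Miri (for example, cargo +nightly miri run) should report as undefined behavior:

```rust
// A bug that compiles quietly but that Miri reports: reading through a
// raw pointer after its backing allocation has been freed.

fn main() {
    let p: *const i32;
    {
        let v = Box::new(123);
        p = &*v as *const i32; // raw pointer into the heap allocation
    } // `v` is dropped here; the allocation is freed

    // Undefined behavior: Miri aborts with a use-after-free error.
    let dangling = unsafe { *p };
    println!("{dangling}");
}
```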

Microsoft

VBScript's 'Deprecation' Confirmed by Microsoft - and Eventual Removal from Windows (microsoft.com) 88

"Microsoft has confirmed plans to pull the plug on VBScript in the second half of 2024 in a move that signals the end of an era for programmers," writes Tech Radar.

Though the language was first introduced in 1996, Microsoft's latest announcement says the move was made "considering the decline in VBScript usage": Beginning with the new OS release slated for later this year [Windows 11, version 24H2], VBScript will be available as features on demand. The feature will be completely retired from future Windows OS releases, as we transition to the more efficient PowerShell experiences.
Around 2027 it will become "disabled by default," with the date of its final removal "to be determined."

But the announcement confirms VBScript will eventually be "retired and eliminated from future versions of Windows." This means all the dynamic link libraries (.dll files) of VBScript will be removed. As a result, projects that rely on VBScript will stop functioning. By then, we expect that you'll have switched to suggested alternatives.
The post recommends migrating applications to PowerShell or JavaScript.

This year's annual "feature update" for Windows will also include Sudo for Windows, Rust in the Windows kernel, "and a number of user interface tweaks, such as the ability to create 7-zip and TAR archives in File Explorer," reports the Register. "It will also include the next evolution of Copilot into an app pinned to the taskbar."

But the downgrading of VBScript "is part of a broader strategy to remove Windows and Office features threat actors use as attack vectors to infect users with malware," reports BleepingComputer: Attackers have also used VBScript in malware campaigns, delivering strains like Lokibot, Emotet, Qbot, and, more recently, DarkGate malware.
AI

AI Software Engineers Make $100,000 More Than Their Colleagues (qz.com) 43

The AI boom and a growing talent shortage have resulted in companies paying AI software engineers a whole lot more than their non-AI counterparts. From a report: As of April 2024, AI software engineers in the U.S. were paid a median salary of nearly $300,000, while other software technicians made about $100,000 less, according to data compiled by salary data website Levels.fyi. The pay gap, already about 30% in mid-2022, has grown to almost 50%: roughly $300,000 versus $200,000.

"It's clear that companies value AI skills and are willing to pay a premium for them, no matter what job level you're at," wrote data scientist Alina Kolesnikova in the Levels.fyi report. That disparity is more pronounced at some companies. The robotaxi company Cruise, for example, pays AI engineers at the staff level a median of $680,500 -- while their non-AI colleagues make $185,500 less, according to Levels.fyi.

Operating Systems

RISC-V Now Supports Rust In the Linux Kernel (phoronix.com) 31

Michael Larabel reports via Phoronix: The latest RISC-V port updates have been merged for the in-development Linux 6.10 kernel. Most notable in today's RISC-V merge for Linux 6.10 is support for the Rust programming language within the Linux kernel. RISC-V joins the likes of x86_64, LoongArch, and ARM64 in supporting the in-kernel Rust language. The use of Rust within the mainline Linux kernel is still rather limited, with just a few basic drivers so far and a lot of infrastructure work taking place, but there are a number of new drivers and other subsystem support on the horizon. RISC-V's Rust support will become more important as that work lands.

The RISC-V updates for Linux 6.10 also add byte/half-word compare-and-exchange, support for Zihintpause within hwprobe, a PR_RISCV_SET_ICACHE_FLUSH_CTX prctl(), and support for lockless lockrefs. More details on these RISC-V updates for Linux 6.10 via this Git merge.
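For readers unfamiliar with the primitive, compare-and-exchange is the atomic "if the value is still X, replace it with Y" operation. The kernel work is about implementing it for byte and half-word sizes on RISC-V; the userspace Rust sketch below only illustrates the semantics:

```rust
// What a byte-sized compare-and-exchange does, shown with Rust's AtomicU8.
// (The kernel change concerns how RISC-V implements this primitive; this
// userspace sketch only demonstrates the semantics.)

use std::sync::atomic::{AtomicU8, Ordering};

fn main() {
    let flag = AtomicU8::new(0);

    // Atomically: if flag == 0, set it to 1; otherwise fail and return
    // the value actually observed. Only one thread can win this race.
    match flag.compare_exchange(0, 1, Ordering::AcqRel, Ordering::Acquire) {
        Ok(prev) => println!("won the race, previous value {prev}"),
        Err(seen) => println!("lost the race, saw {seen}"),
    }
}
```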

AI

FCC Chair Proposes Disclosure Rules For AI-Generated Content In Political Ads (qz.com) 37

FCC Chairwoman Jessica Rosenworcel has proposed (PDF) disclosure rules for AI-generated content used in political ads. "If adopted, the proposal would look into whether the FCC should require political ads on radio and TV to disclose when there is AI-generated content," reports Quartz. From the report: The FCC is seeking comment on whether on-air and written disclosure should be required in broadcasters' political files when AI-generated content is used in political ads; proposing that the rules apply to both candidate and issue advertisements; requesting comment on what a specific definition of AI-generated content should look like; and proposing that disclosure rules be applied to broadcasters and entities involved in programming, such as cable operators and radio providers.

The proposed disclosure rules do not prohibit the use of AI-generated content in political ads. The FCC has authority through the Bipartisan Campaign Reform Act to make rules around political advertising. If the proposal is adopted, the FCC will take public comment on the rules.
"As artificial intelligence tools become more accessible, the Commission wants to make sure consumers are fully informed when the technology is used," Rosenworcel said in a statement. "Today, I've shared with my colleagues a proposal that makes clear consumers have a right to know when AI tools are being used in the political ads they see, and I hope they swiftly act on this issue."
Wireless Networking

Why Your Wi-Fi Router Doubles As an Apple AirTag (krebsonsecurity.com) 73

An anonymous reader quotes a report from Krebs On Security: Apple and the satellite-based broadband service Starlink each recently took steps to address new research into the potential security and privacy implications of how their services geo-locate devices. Researchers from the University of Maryland say they relied on publicly available data from Apple to track the location of billions of devices globally -- including non-Apple devices like Starlink systems -- and found they could use this data to monitor the destruction of Gaza, as well as the movements and in many cases identities of Russian and Ukrainian troops. At issue is the way that Apple collects and publicly shares information about the precise location of all Wi-Fi access points seen by its devices. Apple collects this location data to give Apple devices a crowdsourced, low-power alternative to constantly requesting global positioning system (GPS) coordinates.

Both Apple and Google operate their own Wi-Fi-based Positioning Systems (WPS) that obtain certain hardware identifiers from all wireless access points that come within range of their mobile devices. Both record the Media Access Control (MAC) address that a Wi-Fi access point uses, known as a Basic Service Set Identifier or BSSID. Periodically, Apple and Google mobile devices will forward their locations -- by querying GPS and/or by using cellular towers as landmarks -- along with any nearby BSSIDs. This combination of data allows Apple and Google devices to figure out where they are within a few feet or meters, and it's what allows your mobile phone to continue displaying your planned route even when the device can't get a fix on GPS.

With Google's WPS, a wireless device submits a list of nearby Wi-Fi access point BSSIDs and their signal strengths -- via an application programming interface (API) request to Google -- whose WPS responds with the device's computed position. Google's WPS requires at least two BSSIDs to calculate a device's approximate position. Apple's WPS also accepts a list of nearby BSSIDs, but instead of computing the device's location based on the set of observed access points and their received signal strengths and then reporting that result to the user, Apple's API will return the geolocations of up to 400 more BSSIDs near the one requested. It then uses approximately eight of those BSSIDs to work out the user's location based on known landmarks.
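To make that division of labor concrete, here is a toy sketch of the device-side step in Apple's model: the WPS returns coordinates for known nearby BSSIDs, and the device estimates its own position locally. The signal-strength-weighted centroid below is a deliberately simplified stand-in, not Apple's actual (unpublished) algorithm, and the coordinates are made up:

```rust
// Toy sketch of device-side positioning in the Apple-style model: the WPS
// hands back coordinates for known BSSIDs, and the device estimates its own
// position locally. A signal-strength-weighted centroid stands in here for
// whatever Apple actually does.

struct Observation {
    lat: f64,
    lon: f64,
    rssi_dbm: f64, // received signal strength, e.g. -40 (near) to -90 (far)
}

fn estimate_position(obs: &[Observation]) -> Option<(f64, f64)> {
    if obs.is_empty() {
        return None;
    }
    // Stronger signal => nearer access point => larger weight.
    let weight = |o: &Observation| 1.0 / (-o.rssi_dbm).max(1.0);
    let total: f64 = obs.iter().map(weight).sum();
    let lat = obs.iter().map(|o| weight(o) * o.lat).sum::<f64>() / total;
    let lon = obs.iter().map(|o| weight(o) * o.lon).sum::<f64>() / total;
    Some((lat, lon))
}

fn main() {
    // Made-up coordinates and signal strengths for illustration only.
    let nearby = [
        Observation { lat: 38.9897, lon: -76.9378, rssi_dbm: -45.0 },
        Observation { lat: 38.9901, lon: -76.9370, rssi_dbm: -70.0 },
        Observation { lat: 38.9890, lon: -76.9385, rssi_dbm: -82.0 },
    ];
    if let Some((lat, lon)) = estimate_position(&nearby) {
        println!("estimated position: {lat:.5}, {lon:.5}");
    }
}
```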

In essence, Google's WPS computes the user's location and shares it with the device. Apple's WPS gives its devices a large enough amount of data about the location of known access points in the area that the devices can do that estimation on their own. That's according to two researchers at the University of Maryland, who theorized they could use the verbosity of Apple's API to map the movement of individual devices into and out of virtually any defined area of the world. The UMD pair said they spent a month early in their research continuously querying the API, asking it for the location of more than a billion BSSIDs generated at random. They learned that while only about three million of those randomly generated BSSIDs were known to Apple's Wi-Fi geolocation API, Apple also returned an additional 488 million BSSID locations already stored in its WPS from other lookups.
"Plotting the locations returned by Apple's WPS between November 2022 and November 2023, Levin and Rye saw they had a near global view of the locations tied to more than two billion Wi-Fi access points," the report adds. "The map showed geolocated access points in nearly every corner of the globe, apart from almost the entirety of China, vast stretches of desert wilderness in central Australia and Africa, and deep in the rainforests of South America."

The researchers wrote: "We observe routers move between cities and countries, potentially representing their owner's relocation or a business transaction between an old and new owner. While there is not necessarily a 1-to-1 relationship between Wi-Fi routers and users, home routers typically only have several. If these users are vulnerable populations, such as those fleeing intimate partner violence or a stalker, their router simply being online can disclose their new location."

A copy of the UMD research is available here (PDF).
Supercomputing

Linux Foundation Announces Launch of 'High Performance Software Foundation' (linuxfoundation.org) 4

This week the nonprofit Linux Foundation announced the launch of the High Performance Software Foundation, which "aims to build, promote, and advance a portable core software stack for high performance computing" (or HPC) by "increasing adoption, lowering barriers to contribution, and supporting development efforts."

It promises initiatives focused on "continuously built, turnkey software stacks," as well as other initiatives including architecture support and performance regression testing. Its first open source technical projects are:

- Spack: the HPC package manager.

- Kokkos: a performance-portable programming model for writing modern C++ applications in a hardware-agnostic way.

- Viskores (formerly VTK-m): a toolkit of scientific visualization algorithms for accelerator architectures.

- HPCToolkit: performance measurement and analysis tools for computers ranging from desktop systems to GPU-accelerated supercomputers.

- Apptainer: Formerly known as Singularity, Apptainer is a Linux Foundation project providing a high performance, full featured HPC and computing optimized container subsystem.

- E4S: a curated, hardened distribution of scientific software packages.

As use of HPC becomes ubiquitous in scientific computing and digital engineering, and AI use cases multiply, more and more data centers deploy GPUs and other compute accelerators. The High Performance Software Foundation will provide a neutral space for pivotal projects in the high performance computing ecosystem, enabling industry, academia, and government entities to collaborate on the scientific software.

The High Performance Software Foundation benefits from strong support across the HPC landscape, including Premier Members Amazon Web Services (AWS), Hewlett Packard Enterprise, Lawrence Livermore National Laboratory, and Sandia National Laboratories; General Members AMD, Argonne National Laboratory, Intel, Kitware, Los Alamos National Laboratory, NVIDIA, and Oak Ridge National Laboratory; and Associate Members University of Maryland, University of Oregon, and Centre for Development of Advanced Computing.

In a statement, an AMD vice president said that by joining "we are using our collective hardware and software expertise to help develop a portable, open-source software stack for high-performance computing across industry, academia, and government." And an AWS executive said the high-performance computing community "has a long history of innovation being driven by open source projects. AWS is thrilled to join the High Performance Software Foundation to build on this work. In particular, AWS has been deeply involved in contributing upstream to Spack, and we're looking forward to working with the HPSF to sustain and accelerate the growth of key HPC projects so everyone can benefit."

The new foundation will "set up a technical advisory committee to manage working groups tackling a variety of HPC topics," according to the announcement, following a governance model based on the Cloud Native Computing Foundation.
Programming

FORTRAN and COBOL Re-enter TIOBE's Ranking of Programming Language Popularity (i-programmer.info) 93

"The TIOBE Index sets out to reflect the relative popularity of computer languages," writes i-Programmer, "so it comes as something of a surprise to see two languages dating from the 1950's in this month's Top 20. Having broken into the the Top 20 in April 2021 Fortran has continued to rise and has now risen to it's highest ever position at #10... The headline for this month's report by Paul Jansen on the TIOBE index is:

Fortran in the top 10, what is going on?

Jansen's explanation points to the fact that there are more than 1,000 hits on Amazon for "Fortran Programming" while languages such as Kotlin and Rust barely hit 300 books for the same search query. He also explains that Fortran is still evolving, with the new ISO Fortran 2023 definition published less than half a year ago....

The other legacy language that is on the rise in the TIOBE index is COBOL. We noticed it re-enter the Top 20 in January 2024 and, having dropped out in the interim, it is there again this month.

More details from TechRepublic: Along with Fortran holding on to its spot in the rankings, there were a few small changes in the top 10. Go gained 0.61 percentage points year over year, rising from tenth place in May 2023 to eighth this year. C++ rose slightly in popularity year over year, from fourth place to third, while Java (-3.53%) and Visual Basic (-1.8%) fell.
Here's how TIOBE ranked the 10 most popular programming languages in May:
  1. Python
  2. C
  3. C++
  4. Java
  5. C#
  6. JavaScript
  7. Visual Basic
  8. Go
  9. SQL
  10. Fortran

On the rival PYPL ranking of programming language popularity, Fortran does not appear anywhere in the top 29.

A note on its page explains that "Worldwide, Python is the most popular language, Rust grew the most in the last 5 years (2.1%) and Java lost the most (-4.0%)." Here's how it ranks the most popular programming languages for May:

  1. Python (28.98% share)
  2. Java (15.97% share)
  3. JavaScript (8.79% share)
  4. C# (6.78% share)
  5. R (4.76% share)
  6. PHP (4.55% share)
  7. TypeScript (3.03% share)
  8. Swift (2.76% share)
  9. Rust (2.6% share)

Programming

Apple Geofences Third-Party Browser Engine Work for EU Devices (theregister.com) 81

Apple's grudging accommodation of European law -- allowing third-party browser engines on its mobile devices -- apparently comes with a restriction that makes it difficult to develop and support third-party browser engines for the region. From a report: The Register has learned from those involved in the browser trade that Apple has limited the development and testing of third-party browser engines to devices physically located in the EU. That requirement adds an additional barrier to anyone planning to develop and support a browser with an alternative engine in the EU.

It effectively geofences the development team. Browser-makers whose dev teams are located in the US will only be able to work on simulators. While some testing can be done in a simulator, there's no substitute for testing on device -- which means developers will have to work within Apple's prescribed geographical boundary. Prior to iOS 17.4, Apple required all web browsers on iOS or iPadOS to use Apple's WebKit rendering engine. Alternatives like Gecko (used by Mozilla Firefox) or Blink (used by Google and other Chromium-based browsers) were not permitted. Whatever brand of browser you thought you were using on your iPhone, under the hood it was basically Safari. Browser makers have objected to this for years, because it limits competitive differentiation and reduces the incentive for Apple owners to use non-Safari browsers.

Science

Revolutionary Genetics Research Shows RNA May Rule Our Genome (scientificamerican.com) 80

Philip Ball reports via Scientific American: Thomas Gingeras did not intend to upend basic ideas about how the human body works. In 2012 the geneticist, now at Cold Spring Harbor Laboratory in New York State, was one of a few hundred colleagues who were simply trying to put together a compendium of human DNA functions. Their project was called ENCODE, for the Encyclopedia of DNA Elements. About a decade earlier almost all of the three billion DNA building blocks that make up the human genome had been identified. Gingeras and the other ENCODE scientists were trying to figure out what all that DNA did. The assumption made by most biologists at that time was that most of it didn't do much. The early genome mappers estimated that perhaps 1 to 2 percent of our DNA consisted of genes as classically defined: stretches of the genome that coded for proteins, the workhorses of the human body that carry oxygen to different organs, build heart muscles and brain cells, and do just about everything else people need to stay alive. Making proteins was thought to be the genome's primary job. Genes do this by putting manufacturing instructions into messenger molecules called mRNAs, which in turn travel to a cell's protein-making machinery. As for the rest of the genome's DNA? The "protein-coding regions," Gingeras says, were supposedly "surrounded by oceans of biologically functionless sequences." In other words, it was mostly junk DNA.

So it came as rather a shock when, in several 2012 papers in Nature, he and the rest of the ENCODE team reported that at one time or another, at least 75 percent of the genome gets transcribed into RNAs. The ENCODE work, using techniques that could map RNA activity happening along genome sections, had begun in 2003 and came up with preliminary results in 2007. But not until five years later did the extent of all this transcription become clear. If only 1 to 2 percent of this RNA was encoding proteins, what was the rest for? Some of it, scientists knew, carried out crucial tasks such as turning genes on or off; a lot of the other functions had yet to be pinned down. Still, no one had imagined that three quarters of our DNA turns into RNA, let alone that so much of it could do anything useful. Some biologists greeted this announcement with skepticism bordering on outrage. The ENCODE team was accused of hyping its findings; some critics argued that most of this RNA was made accidentally because the RNA-making enzyme that travels along the genome is rather indiscriminate about which bits of DNA it reads.

Now it looks like ENCODE was basically right. Dozens of other research groups, scoping out activity along the human genome, also have found that much of our DNA is churning out "noncoding" RNA. It doesn't encode proteins, as mRNA does, but engages with other molecules to conduct some biochemical task. By 2020 the ENCODE project said it had identified around 37,600 noncoding genes -- that is, DNA stretches with instructions for RNA molecules that do not code for proteins. That is almost twice as many as there are protein-coding genes. Other tallies vary widely, from around 18,000 to close to 96,000. There are still doubters, but there are also enthusiastic biologists such as Jeanne Lawrence and Lisa Hall of the University of Massachusetts Chan Medical School. In a 2024 commentary for the journal Science, the duo described these findings as part of an "RNA revolution."

What makes these discoveries revolutionary is what all this noncoding RNA -- abbreviated as ncRNA -- does. Much of it indeed seems involved in gene regulation: not simply turning them off or on but also fine-tuning their activity. So although some genes hold the blueprint for proteins, ncRNA can control the activity of those genes and thus ultimately determine whether their proteins are made. This is a far cry from the basic narrative of biology that has held sway since the discovery of the DNA double helix some 70 years ago, which was all about DNA leading to proteins. "It appears that we may have fundamentally misunderstood the nature of genetic programming," wrote molecular biologists Kevin Morris of Queensland University of Technology and John Mattick of the University of New South Wales in Australia in a 2014 article. Another important discovery is that some ncRNAs appear to play a role in disease, for example, by regulating the cell processes involved in some forms of cancer. So researchers are investigating whether it is possible to develop drugs that target such ncRNAs or, conversely, to use ncRNAs themselves as drugs. If a gene codes for a protein that helps a cancer cell grow, for example, an ncRNA that shuts down the gene might help treat the cancer.

IBM

IBM Open-Sources Its Granite AI Models (zdnet.com) 10

An anonymous reader quotes a report from ZDNet: IBM managed the open sourcing of Granite code by using pretraining data from publicly available datasets, such as GitHub Code Clean, Starcoder data, public code repositories, and GitHub issues. In short, IBM has gone to great lengths to avoid copyright or legal issues. The Granite Code Base models are trained on 3 to 4 trillion tokens of code data and natural-language code-related datasets. All these models are licensed under the Apache 2.0 license for research and commercial use. It's that last word -- commercial -- that stopped the other major LLMs from being open-sourced. No one else wanted to share their LLM goodies.

But, as IBM Research chief scientist Ruchir Puri said, "We are transforming the generative AI landscape for software by releasing the highest performing, cost-efficient code LLMs, empowering the open community to innovate without restrictions." Without restrictions, perhaps, but not without specific applications in mind. The Granite models, as IBM ecosystem general manager Kate Woolley said last year, are not "about trying to be everything to everybody. This is not about writing poems about your dog. This is about curated models that can be tuned and are very targeted for the business use cases we want the enterprise to use. Specifically, they're for programming."

These decoder-only models, trained on code from 116 programming languages, range from 3 to 34 billion parameters. They support many developer uses, from complex application modernization to on-device memory-constrained tasks. IBM has already used these LLMs internally in IBM Watsonx Code Assistant (WCA) products, such as WCA for Ansible Lightspeed for IT Automation and WCA for IBM Z for modernizing COBOL applications. Not everyone can afford Watsonx, but now, anyone can work with the Granite LLMs using IBM and Red Hat's InstructLab.

Programming

Stack Overflow is Feeding Programmers' Answers To AI, Whether They Like It or Not 90

Stack Overflow's new deal giving OpenAI access to its API as a source of data has rankled users who posted their questions and answers about coding problems in conversations with other humans. From a report: Users say that when they attempt to alter their posts in protest, the site is retaliating by reversing the alterations and suspending the users who carried them out.

A programmer named Ben posted a screenshot yesterday of the change history for a post seeking programming advice, which they'd updated to say that they had removed the question to protest the OpenAI deal. "The move steals the labour of everyone who contributed to Stack Overflow with no way to opt-out," read the updated post. The text was reverted less than an hour later. A moderator message Ben also included says that Stack Overflow posts become "part of the collective efforts" of other contributors once made and that they should only be removed "under extraordinary circumstances." The moderation team then said it was suspending his account for a week while it reached out "to avoid any further misunderstandings."
AI

OpenAI and Stack Overflow Partner To Bring More Technical Knowledge Into ChatGPT (theverge.com) 18

OpenAI and the developer platform Stack Overflow have announced a partnership that could potentially improve the performance of AI models and bring more technical information into ChatGPT. From a report: OpenAI will have access to Stack Overflow's API and will receive feedback from the developer community to improve the performance of AI models. OpenAI, in turn, will give Stack Overflow attribution -- aka link to its contents -- in ChatGPT. Users of the chatbot will see more information from Stack Overflow's knowledge archive if they ask ChatGPT coding or technical questions. The companies write in the press release that this will "foster deeper engagement with content." Stack Overflow will use OpenAI's large language models to expand its Overflow AI, the generative AI application it announced last year. Further reading: Stack Overflow Cuts 28% Workforce as the AI Coding Boom Continues (October 2023).
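The API underpinning the partnership isn't public, but the long-standing public Stack Exchange API gives a feel for what programmatic access to Stack Overflow's question data looks like. A minimal sketch (the endpoint and parameters follow the public API docs; the partnership's actual interface is unknown):

```rust
// Stand-in sketch: pulling question data from the public Stack Exchange API
// to show what programmatic access to Stack Overflow content looks like.
// Cargo.toml: reqwest = { version = "0.12", features = ["blocking", "json", "gzip"] }
//             serde_json = "1"
// (The API compresses all responses, hence the "gzip" feature.)

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url = "https://api.stackexchange.com/2.3/search\
               ?site=stackoverflow&order=desc&sort=activity&intitle=borrow%20checker";

    let resp: serde_json::Value = reqwest::blocking::get(url)?
        .error_for_status()?
        .json()?;

    // Print the titles of the first few matching questions.
    for item in resp["items"].as_array().into_iter().flatten().take(5) {
        println!("{}", item["title"]);
    }
    Ok(())
}
```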
Programming

The BASIC Programming Language Turns 60 (arstechnica.com) 107

ArsTechnica: Sixty years ago, on May 1, 1964, at 4 am, a quiet revolution in computing began at Dartmouth College. That's when mathematicians John G. Kemeny and Thomas E. Kurtz successfully ran the first program written in their newly developed BASIC (Beginner's All-Purpose Symbolic Instruction Code) programming language on the college's General Electric GE-225 mainframe.

Little did they know that their creation would go on to democratize computing and inspire generations of programmers over the next six decades.

Operating Systems

How CP/M Launched the Next 50 Years of Operating Systems (computerhistory.org) 80

50 years ago this week, PC software pioneer Gary Kildall "demonstrated CP/M, the first commercially successful personal computer operating system, in Pacific Grove, California," according to a blog post from Silicon Valley's Computer History Museum. It tells the story of "how his company, Digital Research Inc., established CP/M as an industry standard and its subsequent loss to a version from Microsoft that copied the look and feel of the DRI software."

Kildall was a CS instructor and later associate professor at the Naval Postgraduate School (NPS) in Monterey, California... He became fascinated with Intel Corporation's first microprocessor chip and simulated its operation on the school's IBM mainframe computer. This work earned him a consulting relationship with the company to develop PL/M, a high-level programming language that played a significant role in establishing Intel as the dominant supplier of chips for personal computers.

To design software tools for Intel's second-generation processor, he needed to connect to a new 8" floppy disk-drive storage unit from Memorex. He wrote code for the necessary interface software that he called CP/M (Control Program for Microcomputers) in a few weeks, but his efforts to build the electronic hardware required to transfer the data failed. The project languished for a year. Frustrated, he called electronic engineer John Torode, a college friend then teaching at UC Berkeley, who crafted a "beautiful rat's nest of wirewraps, boards and cables" for the task.

Late one afternoon in the fall of 1974, together with John Torode, in the backyard workshop of his home at 781 Bayview Avenue, Pacific Grove, Gary "loaded my CP/M program from paper tape to the diskette and 'booted' CP/M from the diskette, and up came the prompt: *"

[...] By successfully booting a computer from a floppy disk drive, they had given birth to an operating system that, together with the microprocessor and the disk drive, would provide one of the key building blocks of the personal computer revolution... As Intel expressed no interest in CP/M, Gary was free to exploit the program on his own and sold the first license in 1975.

What happened next? Here's some highlights from the blog post:
  • "Reluctant to adapt the code for another controller, Gary worked with Glen Ewing to split out the hardware dependent-portions so they could be incorporated into a separate piece of code called the BIOS (Basic Input Output System)... The BIOS code allowed all Intel and compatible microprocessor-based computers from other manufacturers to run CP/M on any new hardware. This capability stimulated the rise of an independent software industry..."
  • "CP/M became accepted as a standard and was offered by most early personal computer vendors, including pioneers Altair, Amstrad, Kaypro, and Osborne..."
  • "[Gary's company] introduced operating systems with windowing capability and menu-driven user interfaces years before Apple and Microsoft... However, by the mid-1980s, in the struggle with the juggernaut created by the combined efforts of IBM and Microsoft, DRI had lost the basis of its operating systems business."
  • "Gary sold the company to Novell Inc. of Provo, Utah, in 1991. Ultimately, Novell closed the California operation and, in 1996, disposed of the assets to Caldera, Inc., which used DRI intellectual property assets to prevail in a lawsuit against Microsoft."

Red Hat Software

Red Hat Upgrades Its Pipeline-Securing (and Verification-Automating) Tools (siliconangle.com) 11

SiliconANGLE reports that to help organizations detect vulnerabilities earlier, Red Hat has "announced updates to its Trusted Software Supply Chain that enable organizations to shift security 'left' in the software supply chain." Red Hat announced Trusted Software Supply Chain in May 2023, pitching it as a way to address the rising threat of software supply chain attacks. The service secures software pipelines by verifying software origins, automating security processes and providing a secure catalog of verified open-source software packages. [Thursday's updates] are aimed at advancing the ability for customers to embed security into the software development life cycle, thereby increasing software integrity earlier in the supply chain while also adhering to industry regulations and compliance standards.

They start with a new tool called Red Hat Trusted Artifact Signer. Based on the open-source Sigstore project [founded at Red Hat and now part of the Open Source Security Foundation], Trusted Artifact Signer allows developers to sign and verify software artifacts cryptographically without managing centralized keys, to enhance trust in the software supply chain. The second new release, Red Hat Trusted Profile Analyzer, provides a central source for security documentation such as Software Bill of Materials and Vulnerability Exploitability Exchange. The tool simplifies vulnerability management by enabling proactive identification and minimization of security threats.

The final new release, Red Hat Trusted Application Pipeline, combines the capabilities of the Trusted Profile Analyzer and Trusted Artifact Signer with Red Hat's internal developer platform to provide integrated security-focused development templates. The feature aims to standardize and accelerate the adoption of secure development practices within organizations.

Specifically, Red Hat's announcement says organizations can use their new Trusted Application Pipeline feature "to verify pipeline compliance and provide traceability and auditability in the CI/CD process with an automated chain of trust that validates artifact signatures, and offers provenance and attestations."
Programming

'Women Who Code' Shuts Down Unexpectedly (bbc.com) 107

Women Who Code (WWC), a U.S.-based organization of 360,000 people supporting women who work in the tech sector, is shutting down due to a lack of funding. "It is with profound sadness that, today, on April 18, 2024, we are announcing the difficult decision to close Women Who Code, following a vote by the Board of Directors to dissolve the organization," the organization said in a blog post. "This decision has not been made lightly. It only comes after careful consideration of all options and is due to factors that have materially impacted our funding sources -- funds that were critical to continuing our programming and delivering on our mission. We understand that this news will come as a disappointment to many, and we want to express our deepest gratitude to each and every one of you who have been a part of our journey." The BBC reports: WWC was started 2011 by engineers who "were seeking connection and support for navigating the tech industry" in San Francisco. It became a nonprofit organization in 2013 and expanded globally. In a post announcing its closure, it said it had held more than 20,000 events and given out $3.5m in scholarships. A month before the closure, WWC had announced a conference for May, which has now been cancelled.

A spokesperson for WWC said: "We kept our programming moving forward while exploring all options." They would not comment on questions about the charity's funding. The most recent annual report, for 2022, showed the charity made almost $4m that year, while its expenses were just under $4.2m. WWC said that "while so much has been accomplished," their mission was not complete. It continued: "Our vision of a tech industry where diverse women and historically excluded people thrive at every level is not fulfilled."

Movies

Struggling Movie Exhibitors Beg Studios For More Movies - and Not Just Blockbusters (yahoo.com) 120

Movie exhibitors still face "serious risks," the Los Angeles Times reported Tuesday: Attendance was on the decline even before the pandemic shuttered theaters, thanks to changing consumer habits and competition for people's time and money from other entertainment options. The industry has demonstrated an over-reliance on Imax-friendly studio action tent poles, when theater chains need a deep and diverse roster of movies in order to thrive... It remains to be seen whether the global box office will ever get back to the $40 billion-plus days of 2019 and earlier years. A clearer picture will emerge in 2025 when the writers' and actors' strikes are further in the past. But overall, there's a strong case that moviegoing has proved to be relatively sturdy despite persistent difficulties.
Which brings us to this year's CinemaCon convention, where multiplex operators heard from Hollywood studios teasing upcoming blockbusters like Joker: Folie à Deux, Furiosa: A Mad Max Saga, Transformers One, and Deadpool & Wolverine. Exhibitors pleaded with the major studios to release more films of varying budgets on the big screen, while studios made the case that their upcoming slates are robust enough to keep them in business... Box office revenue in the U.S. and Canada is expected to total about $8.5 billion, which is down from $9 billion in 2023 and a far cry from the pre-pandemic yearly tallies that nearly reached $12 billion... Though a fuller release schedule is expected for 2025, talk of budget cuts, greater industry consolidation and corporate mergers has forced exhibitors to prepare for the possibility of a near future with fewer studios making fewer movies....

As the domestic film business has been thrown into turmoil in recent years, Japanese cinema and faith-based content have been two of movie theaters' saving graces. Industry leaders kicked off CinemaCon on Tuesday by singing the praises of Sony-owned anime distributor Crunchyroll's hits — including the latest "Demon Slayer" installment. Mitchel Berger, senior vice president of global commerce at Crunchyroll, said Tuesday that the global anime business generated $14 billion a decade ago and is projected to generate $37 billion next year. "Anime is red hot right now," Berger said. "Fans have known about it for years, but now everyone else is catching up and recognizing that it's a cultural, economic force to be reckoned with.... " Another type of product buoying the exhibition industry right now is faith-based programming, shepherded in large part by "Sound of Freedom" distributor Angel Studios...

Theater owners urged studio executives at CinemaCon to put more films in theaters — and not just big-budget tent poles timed for summer movie season and holiday weekends... "Whenever we have a [blockbuster] film — whether it be 'Barbie' or 'Super Mario' ... records are set," added Bill Barstow, co-founder of ACX Cinemas in Nebraska. "But we just don't have enough of them."
