Install LM Studio and download models
Before we begin, ensure you have the following installed:
After you have downloaded and installed LM Studio, open the
application. Go to the Discover tab (sidebar), where
you can browse and search for models. In this example, we will be using
the Phi-3-mini-4k-instruct
model, but you can of course experiment with any other model that you
prefer – as long as you’ve got the hardware to run it!
Now, select the model from the top bar to load it:
To check that everything is working fine, go to the
Chat tab on the sidebar and start a new chat to
interact with the Phi-3 model directly. You’ve now got your language
model up and running!
Required R Packages
To work with LM Studio from R, we will need three
packages:
- tidyverse – for data manipulation
- httr – for API interaction
- jsonlite – for JSON parsing
You can install/update them all with one line of code:
# Install necessary packages
install.packages(c("tidyverse", "httr", "jsonlite"))
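If you would rather not reinstall packages that are already present, a minimal sketch along these lines may help. Note that `not_installed()` and `install_if_missing()` are hypothetical helper names introduced here for illustration, and that this approach skips the update step that a plain `install.packages()` call performs.

```r
# Packages named in the list above
required_pkgs <- c("tidyverse", "httr", "jsonlite")

# Hypothetical helper: return the packages in `pkgs` that are not installed
not_installed <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install only what is missing, and report what that was
install_if_missing <- function(pkgs) {
  to_install <- not_installed(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Inspect what would be installed, without installing anything yet
not_installed(required_pkgs)
```

Calling `install_if_missing(required_pkgs)` then installs whatever that last line reported, and does nothing if all three packages are already available.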
Let us set up the R script by loading the packages and the data we
will be working with:
# Load the packages
library(tidyverse)
library(httr)
library(jsonlite)

top_100_climbs_df <- read_csv("https://raw.githubusercontent.com/martinctc/blog/refs/heads/master/datasets/top_100_climbs.csv")
The top_100_climbs_df dataset contains information on
the top 100 cycling climbs in the UK, which I’ve pulled from the Cycling Uphill website,
originally put together by Simon Warren. There
are 100 rows and the following columns in the dataset:
- climb_id: row unique identifier for the climb
- climb: name of the climb
- height_gain_m: height gain in meters
- average_gradient: average gradient of the climb
- length_km: total length of the climb in kilometers
- max_gradient: maximum gradient of the climb
- url: URL to the climb’s page on Cycling Uphill
Here is what the dataset looks like when we run dplyr::glimpse():
glimpse(top_100_climbs_df)
## Rows: 100
## Columns: 7
## $ climb_id         <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
## $ climb            <chr> "Cheddar Gorge", "Weston Hill", "Crowcombe Combe", "P…
## $ height_gain_m    <dbl> 150, 165, 188, 372, 326, 406, 166, 125, 335, 163, 346…
## $ average_gradient <dbl> 0.05, 0.09, 0.15, 0.12, 0.10, 0.04, 0.11, 0.11, 0.06,…
## $ length_km        <dbl> 3.5, 1.8, 1.2, 4.9, 3.2, 11.0, 1.5, 1.1, 5.4, 1.4, 9.…
## $ max_gradient     <dbl> 0.16, 0.18, 0.25, 0.25, 0.17, 0.12, 0.25, 0.18, 0.12,…
## $ url              <chr> "https://cyclinguphill.com/cheddar-gorge/", "https://…
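Because the later steps refer to these columns by name, it can be worth confirming that they are all present before going further. The following is a minimal sketch in base R; `missing_columns()` is a hypothetical helper introduced here, and `example_df` is a one-row stand-in built from the Cheddar Gorge values shown above rather than the real dataset.

```r
# Column names as described for the dataset
expected_cols <- c(
  "climb_id", "climb", "height_gain_m", "average_gradient",
  "length_km", "max_gradient", "url"
)

# Hypothetical helper: return the expected columns that are absent from `df`
missing_columns <- function(df, expected = expected_cols) {
  setdiff(expected, names(df))
}

# One-row stand-in for the real data (first climb in the dataset)
example_df <- data.frame(
  climb_id = 1,
  climb = "Cheddar Gorge",
  height_gain_m = 150,
  average_gradient = 0.05,
  length_km = 3.5,
  max_gradient = 0.16,
  url = "https://cyclinguphill.com/cheddar-gorge/"
)

# Empty result: every expected column is present
missing_columns(example_df)

# Keeping only `climb` leaves the six other columns reported as missing
missing_columns(example_df["climb"])
```

With the real data loaded, `missing_columns(top_100_climbs_df)` should likewise return an empty character vector.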
Our goal here is to use this dataset to generate text descriptions
for each of the climbs using the language model. Since the values will be
passed to the model as text, we will clean up the dataset a little by
converting the gradient values from proportions (e.g. 0.05) to percentage
strings (e.g. "5%"):
top_100_climbs_df_clean <- top_100_climbs_df %>%
  mutate(
    average_gradient = scales::percent(average_gradient),
    max_gradient = scales::percent(max_gradient)
  )
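To see what this conversion does, here is a small sketch on a stand-in tibble holding the first three climbs from the glimpse() output. The `accuracy = 1` argument is an addition made for this example: it forces whole-number percentages, whereas the call above lets scales choose the precision from the data. The name `climbs_sample` is likewise only illustrative.

```r
library(dplyr)

# Stand-in for the full dataset: first three climbs from the glimpse() output
climbs_sample <- tibble(
  climb = c("Cheddar Gorge", "Weston Hill", "Crowcombe Combe"),
  average_gradient = c(0.05, 0.09, 0.15),
  max_gradient = c(0.16, 0.18, 0.25)
)

climbs_sample_clean <- climbs_sample %>%
  mutate(
    # accuracy = 1 (an assumption for this sketch) rounds to whole percentages
    average_gradient = scales::percent(average_gradient, accuracy = 1),
    max_gradient = scales::percent(max_gradient, accuracy = 1)
  )

climbs_sample_clean$average_gradient     # "5%" "9%" "15%"
class(climbs_sample_clean$max_gradient)  # "character"
```

The converted columns are character vectors rather than numbers, which suits text generation but means any numeric work on the gradients should use the original top_100_climbs_df.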