Using information from open-ended plain text

Some datasets contain a variable with open-ended, written responses, or a mish-mash of text, like this:


or this:


Let's say we want to create a variable that indicates whether the blob of text in a given row contains the phrase "Dual Pane".

Start by clicking "Create"...


...then "Variable by Filters"...


Then complete the form:

  • Give the variable a name. Below we named it "Dual Pane Windows"
  • Select the text blob variable in question. Below we selected "Features". If you don't see your variable in the list, you'll need to unhide it first.
  • Select "Contains"
  • Write in the text that you'd like to search for within that text blob, then press Enter or Tab. Below we want to look for the phrase "Dual Pane".
  • Write in the value you want to show if a given row of data does in fact contain that string. Below, if the phrase "Dual Pane" is found, the value will be set to "Yes".
  • In the last box, write the value to use when the previous condition isn't met. Below we want any row that does not contain "Dual Pane" to have the value "No".

You'll then click "Create" and your new variable will appear.
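
If it helps to see the rule outside the form, here is a minimal Python sketch of the same logic. It is not how the tool works internally; the function name `variable_by_filter`, the sample rows, and the choice to ignore upper/lower case when matching are all assumptions made for illustration.

```python
def variable_by_filter(rows, source, search_text, value_if_found,
                       value_otherwise, case_sensitive=False):
    """Return one new value per row, based on a "Contains" check.

    Hypothetical helper: mirrors the form fields (source variable, search
    text, value if found, value otherwise). Case-insensitive matching is
    an assumption, not confirmed behaviour of the tool.
    """
    needle = search_text if case_sensitive else search_text.lower()
    result = []
    for row in rows:
        text = row.get(source) or ""  # treat missing/empty text as ""
        haystack = text if case_sensitive else text.lower()
        result.append(value_if_found if needle in haystack else value_otherwise)
    return result


# Made-up sample data standing in for the "Features" text blob.
rows = [
    {"Features": "Dual Pane Windows, Ceiling Fan, Garage"},
    {"Features": "Fireplace, Hardwood Floors"},
    {"Features": None},
]

dual_pane = variable_by_filter(rows, "Features", "Dual Pane", "Yes", "No")
print(dual_pane)
```

Each row gets "Yes" when its "Features" text contains the search phrase and "No" otherwise, which is the same outcome the completed form produces for the new variable.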