Merge field data

You are browsing a tutorial guide for the latest Octoparse version. If you are running an older version of Octoparse, we strongly recommend you upgrade because it is faster, easier, and more robust! Download and upgrade here if you haven't already done so!

The "Merge field data" feature can be used to easily combine data of different rows into ONE single row.

Let's suppose you need to extract an article from a blog. In some cases, you might not be able to select the entire article at once because it is split into separate paragraphs, but you still want all the paragraphs in one single row instead of having each paragraph in its own row.

This is the perfect time to take advantage of the "Merge field data" feature for combining the extracted data into one single row of data. Let's see how to get this done with an example.

Here we use blog content from https://philipyancey.com/a-view-from-abroad to demonstrate.

There are two steps to merging the rows:

1. Select the desired data to extract

  • Click on the first paragraph of the article and choose Select all similar elements in the Tips panel. A Loop Item will be created to extract every paragraph of the post.

  • Select Text in the Tips panel


2. Merge the extracted data

  • Click on the Extract Data step and go to the Data Preview panel

  • Click on the More button and select Merge field data

You are all set! Let's run the task and see what the actual exported data looks like. You can see that paragraphs captured in Field 1 are now merged into a single row as one big chunk.
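
To picture what the merge does to the exported data, here is a minimal Python sketch. It is not Octoparse's implementation: the function name merge_field_rows, the field key "Field1", and the newline separator are all assumptions made for illustration, and the separator Octoparse actually uses may differ.

```python
def merge_field_rows(rows, field, separator="\n"):
    """Collapse the values of `field` across all rows into one single row.

    Hypothetical helper for illustration only; the separator is an assumption.
    """
    # Keep only rows that actually have a non-empty value for the field.
    values = [row[field] for row in rows if row.get(field)]
    # Return a one-row result whose field holds all values joined together.
    return [{field: separator.join(values)}]


# One row per paragraph, as extracted by the Loop Item before merging.
rows = [
    {"Field1": "First paragraph."},
    {"Field1": "Second paragraph."},
    {"Field1": "Third paragraph."},
]

merged = merge_field_rows(rows, "Field1")
print(merged)
```

Before the merge there is one row per paragraph; afterwards there is a single row whose Field1 cell contains every paragraph in order.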


Note:

  1. Merge field data is especially useful for extracting articles from any website. You can extract the article as one whole chunk with no other elements like blank lines, comments, or images.

  2. Once the data is merged into one big chunk, you can use the Data reformat tools to reformat it further, for example by adding a prefix or suffix such as "|" or "\".

  3. If there are multiple fields to extract, you need to set up "Merge field data" for every field.

  4. This feature can also be used to merge two fields. Add two Extract Data actions to the workflow with one field in each, give both fields the same name, and set "merge multiple rows" for both fields. As a result, the data scraped in the two fields will be merged into one cell.

  5. This feature cannot be previewed. The data will only be merged when the task runs.
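
Notes 2 and 4 can be pictured with a short Python sketch. This only illustrates the resulting data, not how Octoparse works internally: the helper names, the space separator, and the order in which the two fields are joined are assumptions.

```python
def add_prefix_suffix(text, prefix="", suffix=""):
    """Wrap merged text with a prefix and/or suffix (as in note 2)."""
    return f"{prefix}{text}{suffix}"


def merge_same_named_fields(first_rows, second_rows, field, separator=" "):
    """Merge a same-named field from two Extract Data actions into one cell (as in note 4).

    Hypothetical helper; the join order and separator are assumptions.
    """
    values = [row[field] for row in first_rows + second_rows if row.get(field)]
    return {field: separator.join(values)}


# Output of two separate Extract Data actions that share the field name "Field1".
title_rows = [{"Field1": "A View from Abroad"}]
body_rows = [{"Field1": "Paragraph one."}, {"Field1": "Paragraph two."}]

cell = merge_same_named_fields(title_rows, body_rows, "Field1")
wrapped = add_prefix_suffix(cell["Field1"], prefix="|", suffix="|")
print(wrapped)
```

Because both actions write to the same field name, their values end up in one cell, which can then be wrapped with a delimiter such as "|".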
