We’re assisting several clients right now with Marketo migrations. When a large company adopts an enterprise solution like this, it works like a spider web, weaving itself into processes and platforms over the years until the company is no longer aware of every touchpoint.
With an enterprise marketing automation platform like Marketo, forms are the entry point for data across sites and landing pages. Companies often have thousands of pages and hundreds of forms throughout their sites, and every one of them needs to be identified for updating.
A great tool for this is Screaming Frog’s SEO Spider, perhaps the most popular tool on the market for crawling, auditing, and extracting data from a site. It’s feature-rich and offers hundreds of options covering virtually every crawling task you might need.
Screaming Frog SEO Spider: Crawl And Extract
A key feature of Screaming Frog SEO Spider is that you can perform custom extractions using Regex, XPath, or CSSPath rules. This is extremely useful here, since we want to crawl the client’s sites and capture the MunchkinID and FormId values from each page.
In the tool, open Configuration > Custom > Extraction to define the elements you wish to extract.
The extraction screen allows for virtually unlimited data collection.
Regex, XPath, and CSSPath Extraction
For the MunchkinID, the identifier sits inside the Marketo form script that’s embedded in the page. We apply a Regex rule to capture the id from within that script tag:
Regex: ["']id["']: *["'](.*?)["']
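Before committing to a full crawl, it can help to try the pattern against a saved copy of a page. Here is a minimal Python sketch of that idea. The pattern is the one above, unchanged; the sample fragment, the variable names, and the extract_ids helper are all hypothetical, and the script on your own pages may be laid out differently.

```python
import re

# The Regex rule from the article, unchanged. It captures whatever sits
# between the quotes that follow an "id": or 'id': key.
MUNCHKIN_PATTERN = re.compile(r"""["']id["']: *["'](.*?)["']""")

# Hypothetical page fragment, made up for illustration only.
SAMPLE_HTML = """
<script>
  var formConfig = {"id": "123-ABC-456", "name": "contact"};
</script>
"""


def extract_ids(html):
    """Return every value captured by the pattern's single group."""
    return MUNCHKIN_PATTERN.findall(html)


print(extract_ids(SAMPLE_HTML))
```

Because the pattern only anchors on a quoted id key, it will also capture any other quoted "id" value on the page, so it’s worth spot-checking the results of a small crawl first.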
For the Form ID, the data is in an input tag within the Marketo form:
<input type="hidden" name="formid" class="mktoField mktoFieldDescriptor" value="1234">
We apply an XPath rule to capture the id from the form that’s inserted in the page. The XPath query looks for a form containing an input with a name of formid, and the extraction saves that input’s value.
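An expression along the lines of //form//input[@name="formid"]/@value would fit that description, though that is our assumption here and not necessarily the exact rule to paste into the tool. The same lookup can be sketched in Python with the standard library’s ElementTree, which supports only a limited XPath subset (it cannot select @value directly, so the attribute is read in code). The sample markup is hypothetical: it wraps the input shown above in a form and self-closes it so the XML parser accepts it. Real-world HTML would need an HTML-aware parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the hidden input from the article, wrapped in a
# form and written as well-formed XML so that ElementTree can parse it.
SAMPLE_MARKUP = """
<div>
  <form>
    <input type="hidden" name="formid" class="mktoField mktoFieldDescriptor" value="1234"/>
  </form>
</div>
"""


def extract_form_ids(markup):
    """Return the value of every formid input found inside a form."""
    root = ET.fromstring(markup)
    # ElementTree's XPath subset: any form below the root, then any
    # descendant input whose name attribute is exactly "formid".
    inputs = root.findall(".//form//input[@name='formid']")
    return [node.get("value") for node in inputs]


print(extract_form_ids(SAMPLE_MARKUP))
```

Requiring the form ancestor keeps the match tied to the form itself, so a stray input named formid elsewhere on the page is ignored.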
While this is a very specific application, it’s an incredibly useful one when you’re working with large sites. You’ll absolutely want to audit where your forms are embedded throughout the site.