This Node.js script extracts URLs from one or more sitemap.xml files and appends them to a text file.
- Node.js v10 or higher
- The following Node.js modules:
  - fs (built into Node.js)
  - https (built into Node.js)
  - xml2js (third-party, installed from npm)
Clone this repository to your local machine:
git clone https://github.com/your-username/sitemap-url-extractor.git
Navigate to the project directory:
cd sitemap-url-extractor
Install the required Node.js modules:
npm install
Open the index.js file in a text editor.
Edit the sitemapUrls array so that it lists the URLs of the sitemap.xml files you want to process. Alternatively, load the sitemap URLs from a sitemaps.txt file.
Save the changes to the index.js file.
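If you prefer the sitemaps.txt route, the following is a minimal sketch of how the list could be read. It assumes one sitemap URL per line and treats lines starting with "#" as comments. The helper names parseSitemapList and loadSitemapUrls are hypothetical and may not match what index.js actually does.

```javascript
const fs = require('fs');

// Hypothetical helper: turn the text of sitemaps.txt into an array of URLs.
// Assumption: one URL per line; blank lines and "#" comment lines are skipped.
function parseSitemapList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '' && !line.startsWith('#'));
}

// Hypothetical helper: read the list from disk, or return the fallback
// array (for example a hard-coded sitemapUrls) when the file does not exist.
function loadSitemapUrls(filePath, fallback = []) {
  if (!fs.existsSync(filePath)) {
    return fallback;
  }
  return parseSitemapList(fs.readFileSync(filePath, 'utf8'));
}
```

With this in place, something like `const sitemapUrls = loadSitemapUrls('sitemaps.txt', ['https://example.com/sitemap.xml']);` would use the file when it is present and the inline list otherwise.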
Run the script using Node.js:
node index.js
The script will fetch each sitemap.xml file, extract the URLs, and append them to a file called urls.txt in the project directory.
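The extract-and-append step can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: it assumes the XML has already been parsed by xml2js with its default options, which represent child elements as arrays (so each entry's loc is an array holding one string). The helper names extractUrls and appendUrls are hypothetical.

```javascript
const fs = require('fs');

// Hypothetical helper: collect <loc> values from an xml2js-style object.
// Assumed shape: { urlset: { url: [ { loc: ['https://...'] }, ... ] } }
// A sitemap index ({ sitemapindex: { sitemap: [...] } }) is handled the same way.
function extractUrls(parsed) {
  const root = (parsed && (parsed.urlset || parsed.sitemapindex)) || {};
  const entries = root.url || root.sitemap || [];
  return entries
    .map((entry) => (entry.loc && entry.loc[0] ? String(entry.loc[0]).trim() : ''))
    .filter((loc) => loc !== '');
}

// Hypothetical helper: append the URLs to a text file, one per line.
// appendFileSync creates the file if it does not exist yet.
function appendUrls(filePath, urls) {
  if (urls.length === 0) {
    return;
  }
  fs.appendFileSync(filePath, urls.join('\n') + '\n');
}
```

Because the file is appended to rather than overwritten, running the script repeatedly adds the same URLs again. Delete urls.txt first if you want a fresh list.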