A Model Context Protocol (MCP) server for web content scanning and analysis. This server provides tools for fetching, analyzing, and extracting information from web pages.
- Page Fetching: Convert web pages to Markdown for easy analysis
- Link Extraction: Extract and analyze links from web pages
- Site Crawling: Recursively crawl websites to discover content
- Link Checking: Identify broken links on web pages
- Pattern Matching: Find URLs matching specific patterns
- Sitemap Generation: Generate XML sitemaps for websites
To install Webscan for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install mcp-server-webscan --client claude
# Clone the repository
git clone <repository-url>
cd mcp-server-webscan
# Install dependencies
npm install
# Build the project
npm run build
# Start the server
npm start
The server runs on stdio transport, making it compatible with MCP clients like Claude Desktop.
- fetch_page
  - Fetches a web page and converts it to Markdown
  - Parameters:
    - url (required): URL of the page to fetch
    - selector (optional): CSS selector to target specific content
- extract_links
  - Extracts all links from a web page with their text
  - Parameters:
    - url (required): URL of the page to analyze
    - baseUrl (optional): Base URL to filter links
- crawl_site
  - Recursively crawls a website up to a specified depth
  - Parameters:
    - url (required): Starting URL to crawl
    - maxDepth (optional, default: 2): Maximum crawl depth
- check_links
  - Checks for broken links on a page
  - Parameters:
    - url (required): URL to check links for
- find_patterns
  - Finds URLs matching a specific pattern
  - Parameters:
    - url (required): URL to search in
    - pattern (required): Regex pattern to match URLs against
- generate_sitemap
  - Generates a simple XML sitemap
  - Parameters:
    - url (required): Root URL for sitemap
    - maxUrls (optional, default: 100): Maximum number of URLs to include
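To make the maxDepth and maxUrls parameters concrete, here is a minimal sketch of a depth-limited crawl feeding a simple sitemap. It is not the server's actual implementation: the names crawl, buildSitemap, escapeXml, and LinkSource are hypothetical, the same-origin restriction is an assumption, and the link source is injected and synchronous so the sketch runs without network access.

```typescript
// Hypothetical link source: maps a page URL to the hrefs found on it.
// A real server would fetch and parse each page here (assumption).
type LinkSource = (pageUrl: string) => string[];

// Breadth-first crawl that follows links at most `maxDepth` hops from the
// start URL (default 2, mirroring the documented crawl_site default).
function crawl(startUrl: string, getLinks: LinkSource, maxDepth: number = 2): string[] {
  const start = new URL(startUrl);
  start.hash = "";
  const seen = new Set<string>([start.href]);
  let frontier: string[] = [start.href];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const page of frontier) {
      for (const href of getLinks(page)) {
        let target: URL;
        try {
          target = new URL(href, page); // resolve relative links
        } catch {
          continue; // skip hrefs that cannot be parsed
        }
        target.hash = ""; // treat /a and /a#top as the same page
        // Assumption: stay on the starting origin and visit each page once.
        if (target.origin !== start.origin || seen.has(target.href)) continue;
        seen.add(target.href);
        next.push(target.href);
      }
    }
    frontier = next;
  }
  return Array.from(seen);
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a minimal sitemap containing at most `maxUrls` entries
// (default 100, mirroring the documented generate_sitemap default).
function buildSitemap(urls: string[], maxUrls: number = 100): string {
  const entries = urls
    .slice(0, maxUrls)
    .map((u) => `  <url><loc>${escapeXml(u)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

// Usage with an in-memory site (hypothetical data).
const demoSite: Record<string, string[]> = {
  "https://example.com/": ["/docs", "/about#team"],
  "https://example.com/docs": ["/docs/api"],
};
const demoPages = crawl("https://example.com", (page) => demoSite[page] || []);
console.log(buildSitemap(demoPages));
```

Raising maxDepth widens the crawl by one more hop per step, while maxUrls only truncates the list that ends up in the sitemap.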
- Configure the server in your Claude Desktop settings:
{
  "mcpServers": {
    "webscan": {
      "command": "node",
      "args": ["path/to/mcp-server-webscan/dist/index.js"],
      "env": {
        "NODE_ENV": "development"
      }
    }
  }
}
- Use the tools in your conversations:
Could you fetch the content from https://example.com and convert it to Markdown?
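For readers curious what a prompt like this turns into on the wire, the sketch below builds the kind of JSON-RPC tools/call request an MCP client would send for fetch_page, framed as a single newline-terminated line as the stdio transport expects. The helper names buildToolCall and frame and the request id are illustrative assumptions; in practice the client, such as Claude Desktop, constructs this message for you.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build a tools/call request for a named tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: the stdio transport carries one JSON message per line,
// so the message is serialized without embedded newlines and terminated by "\n".
function frame(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

// Request corresponding to the example prompt above.
const request = buildToolCall(1, "fetch_page", { url: "https://example.com" });
const line = frame(request);
console.log(line);
```

The optional selector parameter would simply be added to the arguments object alongside url.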
- Node.js >= 18
- npm
mcp-server-webscan/
├── src/
│ └── index.ts # Main server implementation
├── dist/ # Compiled JavaScript
├── package.json
└── tsconfig.json
# Build the project
npm run build
# Run in development mode
npm run dev
The server handles the following error conditions:
- Invalid parameters
- Network errors
- Content parsing errors
- URL validation failures
Errors are returned in the format defined by the MCP specification.
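As an illustration of what MCP-formatted errors can look like, the sketch below validates a url argument and reports failures as a tool result with isError set to true, which is one of the two error channels the MCP specification describes (the other being a JSON-RPC error response). Which channel this server actually uses is not stated here, and the names errorResult and validateUrl, along with the http/https restriction, are assumptions made for the example.

```typescript
// Shape of an MCP tool result carrying text content.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Outcome of validation: either a parsed URL or a ready-to-return error result.
type UrlCheck = { ok: true; url: URL } | { ok: false; error: ToolResult };

// Hypothetical helper: wrap a message as a tool-level error result.
function errorResult(message: string): ToolResult {
  return { content: [{ type: "text", text: message }], isError: true };
}

// Hypothetical helper: check that `input` is a usable web URL.
function validateUrl(input: unknown): UrlCheck {
  if (typeof input !== "string" || input.length === 0) {
    return { ok: false, error: errorResult("Invalid parameters: url must be a non-empty string") };
  }
  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    return { ok: false, error: errorResult("Invalid URL: " + input) };
  }
  // Assumption: only http and https pages are scanned.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, error: errorResult("Unsupported protocol: " + parsed.protocol) };
  }
  return { ok: true, url: parsed };
}

// Usage: a malformed URL yields an error result instead of throwing.
const check = validateUrl("not a url");
console.log(check.ok ? check.url.href : check.error.content[0].text);
```

Returning a result object rather than throwing keeps the failure visible to the model, which can then correct the argument and retry.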
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
MIT License - see the LICENSE file for details