This repository was archived by the owner on May 2, 2018. It is now read-only.

Financial-Times/v1-suggestor


V1 suggestor

Processes metadata about content that comes from the QMI system, also known as V1 annotations.

  • Reads V1 metadata for an article from the kafka source topic NativeCmsMetadataPublicationEvents
  • Filters and transforms it to the UP standard JSON representation
  • Puts the result onto the kafka destination topic ConceptSuggestions

The v1-suggestor service communicates with kafka via http-rest-proxy. It polls kafka-rest-proxy for messages and POSTs the transformed messages back to kafka-rest-proxy.
The service is deployed in the Delivery clusters.
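
The read, filter/transform, write flow described above, together with the SRC_CONCURRENT_PROCESSING switch listed under Startup parameters, could be sketched roughly as follows. This is only an illustration under stated assumptions: the `handler` type, the `process` function and the string-based messages are hypothetical and are not taken from the service's code.

```go
package main

import (
	"fmt"
	"sync"
)

// handler is a hypothetical per-message step. It returns the transformed
// message, and false when the message should be filtered out.
type handler func(in string) (out string, keep bool)

// process runs h over msgs and passes every kept result to send.
// When concurrent is true each message gets its own goroutine, so send
// must be safe for concurrent use and output order is not guaranteed.
func process(msgs []string, h handler, send func(string), concurrent bool) {
	var wg sync.WaitGroup
	run := func(m string) {
		if out, ok := h(m); ok {
			send(out)
		}
	}
	for _, m := range msgs {
		if !concurrent {
			run(m)
			continue
		}
		wg.Add(1)
		go func(m string) {
			defer wg.Done()
			run(m)
		}(m)
	}
	wg.Wait()
}

func main() {
	var mu sync.Mutex
	var sent []string
	send := func(s string) {
		mu.Lock()
		defer mu.Unlock()
		sent = append(sent, s)
	}
	// Toy handler: drop empty messages, mark the rest as transformed.
	mark := func(in string) (string, bool) { return in + "!", in != "" }

	process([]string{"a", "", "b"}, mark, send, true)
	fmt.Println(len(sent), "messages forwarded")
}
```

Sequential mode keeps the output in input order; concurrent mode trades that ordering for throughput.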

Installation

go get -u github.com/kardianos/govendor
go get -u github.com/Financial-Times/v1-suggestor
cd $GOPATH/src/github.com/Financial-Times/v1-suggestor
govendor sync
go build .

Startup parameters

| Parameter | Value in prod | Explained |
| --- | --- | --- |
| SRC_ADDR | http://localhost:8080 | URL of the http-rest-proxy host to connect to in order to receive messages from kafka. |
| SRC_GROUP | v1Suggestor | The consumer group for receiving messages from kafka. |
| SRC_TOPIC | NativeCmsMetadataPublicationEvents | The kafka topic to consume messages from. |
| SRC_QUEUE | kafka | Used by Vulcan to route http requests based on the Host header. In the docker cluster all hosts are at http://localhost:8080, so this header is supplied to distinguish one service from another. The Host header kafka points to http-rest-proxy. |
| SRC_CONCURRENT_PROCESSING | false | Whether the consumer processes messages concurrently or sequentially. |
| DEST_ADDRESS | http://localhost:8080 | URL of the http-rest-proxy host to connect to in order to send messages to kafka. In the prod env this is typically the same address as SRC_ADDR. |
| DEST_TOPIC | ConceptSuggestions | The kafka topic to send messages to. |
| DEST_QUEUE | kafka | Used by Vulcan to route http requests based on the Host header. In the prod docker cluster it is the same as SRC_QUEUE. |
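
As an illustration of how these parameters might be read at startup, here is a minimal sketch that loads them from environment variables and falls back to the "Value in prod" column. The `config` struct, `getenv` and `loadConfig` are hypothetical names; the real service may read its parameters differently (for example through a CLI library).

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// config mirrors the startup parameters documented above (hypothetical type).
type config struct {
	SrcAddr, SrcGroup, SrcTopic, SrcQueue string
	SrcConcurrentProcessing               bool
	DestAddress, DestTopic, DestQueue     string
}

// getenv returns lookup(key), or def when the variable is unset or empty.
func getenv(lookup func(string) string, key, def string) string {
	if v := lookup(key); v != "" {
		return v
	}
	return def
}

// loadConfig builds a config, using the documented prod values as defaults.
func loadConfig(lookup func(string) string) (config, error) {
	raw := getenv(lookup, "SRC_CONCURRENT_PROCESSING", "false")
	concurrent, err := strconv.ParseBool(raw)
	if err != nil {
		return config{}, fmt.Errorf("SRC_CONCURRENT_PROCESSING: %w", err)
	}
	return config{
		SrcAddr:                 getenv(lookup, "SRC_ADDR", "http://localhost:8080"),
		SrcGroup:                getenv(lookup, "SRC_GROUP", "v1Suggestor"),
		SrcTopic:                getenv(lookup, "SRC_TOPIC", "NativeCmsMetadataPublicationEvents"),
		SrcQueue:                getenv(lookup, "SRC_QUEUE", "kafka"),
		SrcConcurrentProcessing: concurrent,
		DestAddress:             getenv(lookup, "DEST_ADDRESS", "http://localhost:8080"),
		DestTopic:               getenv(lookup, "DEST_TOPIC", "ConceptSuggestions"),
		DestQueue:               getenv(lookup, "DEST_QUEUE", "kafka"),
	}, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the loader easy to exercise without touching the process environment.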

Prerequisites

To run v1-suggestor you need at least kafka/zookeeper and kafka-rest-proxy to be accessible, and you must provide the host and port to connect to them as startup parameters.

Run locally

   export|set SRC_ADDR=http://kafkahost:8080
   export|set SRC_GROUP=FooGroup
   export|set SRC_TOPIC=FooBarEvents
   export|set SRC_QUEUE=kafka
   export|set SRC_CONCURRENT_PROCESSING=true
   export|set DEST_ADDRESS=http://kafkahost:8080
   export|set DEST_TOPIC=DestTopic
   export|set DEST_QUEUE=kafka
   export|set ENVIRONMENT=coco-semantic
   export|set DOCKER_APP_VERSION=latest
./v1-suggestor[.exe]

Build in Docker

git config remote.origin.url https://github.com/Financial-Times/v1-suggestor.git
docker build -t coco/v1-suggestor:$DOCKER_APP_VERSION .
git config remote.origin.url git@github.com:Financial-Times/v1-suggestor.git

Run in Docker

docker run --name v1-suggestor -p 8080 \
	--env "SRC_ADDR=http://kafka:8080" \
	--env "SRC_GROUP=v1Suggestor" \
	--env "SRC_TOPIC=NativeCmsMetadataPublicationEvents" \
	--env "SRC_QUEUE=kafka" \
	--env "SRC_CONCURRENT_PROCESSING=false" \
	--env "DEST_ADDRESS=http://kafka:8080" \
	--env "DEST_TOPIC=ConceptSuggestions" \
	--env "DEST_QUEUE=kafka" \
	--env "ENVIRONMENT=coco-$ENVIRONMENT_TAG" \
	coco/v1-suggestor:$DOCKER_APP_VERSION

Admin Endpoints

| Endpoint | Explained |
| --- | --- |
| /__health | checks that v1-suggestor can communicate to kafka via http-rest-proxy |
| /__ping | response status: 200 body:"pong" |
| /ping | the same as above for compatibility with Dropwizard java apps |
| /__gtg | response status: 200 when "good to go" or 503 when not "good to go" |
| /__build-info | consisting of version (release tag), git repository url, revision (git commit-id), deployment datetime, builder (go or java or ...) |
| /build-info | the same as above for compatibility with Dropwizard java apps |
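
The documented behaviour of /__ping, /ping and /__gtg can be sketched with the standard net/http package. This is a minimal sketch, not the service's actual handlers: `newAdminMux`, `probe` and the `healthy` callback (standing in for the real kafka connectivity check) are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newAdminMux wires /__ping, /ping and /__gtg. healthy is a hypothetical
// callback that reports whether the service is "good to go".
func newAdminMux(healthy func() bool) *http.ServeMux {
	mux := http.NewServeMux()
	ping := func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "pong") // status defaults to 200
	}
	mux.HandleFunc("/__ping", ping)
	mux.HandleFunc("/ping", ping) // Dropwizard-compatible alias
	mux.HandleFunc("/__gtg", func(w http.ResponseWriter, r *http.Request) {
		if !healthy() {
			w.WriteHeader(http.StatusServiceUnavailable) // 503
			return
		}
		w.WriteHeader(http.StatusOK) // 200
	})
	return mux
}

// probe issues an in-memory GET and returns the status code and body.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newAdminMux(func() bool { return false })
	for _, p := range []string{"/__ping", "/ping", "/__gtg"} {
		code, body := probe(mux, p)
		fmt.Printf("%s -> %d %q\n", p, code, body)
	}
}
```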

Example Message-In

FTMSG/1.0  
Content-Type: application/json  
Message-Id: 266c7604-b582-47a3-9b7e-c8aad93f1ec9  
Message-Timestamp: 2016-12-29T14:54:10.160Z  
Message-Type: cms-content-published  
Origin-System-Id: http://cmdb.ft.com/systems/binding-service  
X-Request-Id: tid_9rvfuynl4b  
{"value":"<base64 encoded message body>"}  
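
The last line of the message is a JSON object whose value field carries the base64 encoded body. A minimal sketch of unwrapping it is shown here; the `envelope` type and `decodeBody` function are hypothetical names, and the use of the standard (not URL-safe) base64 alphabet is an assumption.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"os"
)

// envelope models the JSON line that closes the message (hypothetical type).
type envelope struct {
	Value string `json:"value"`
}

// decodeBody extracts the value field and base64-decodes it.
// Assumption: the payload uses the standard base64 alphabet with padding.
func decodeBody(raw []byte) ([]byte, error) {
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return nil, fmt.Errorf("parsing envelope: %w", err)
	}
	body, err := base64.StdEncoding.DecodeString(env.Value)
	if err != nil {
		return nil, fmt.Errorf("decoding value: %w", err)
	}
	return body, nil
}

func main() {
	// Build a sample envelope around a tiny XML fragment, then decode it.
	payload := base64.StdEncoding.EncodeToString([]byte("<ns5:contentRef/>"))
	body, err := decodeBody([]byte(`{"value":"` + payload + `"}`))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(string(body))
}
```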

Decoded Message-In body

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>  
<ns5:contentRef ns5:created="2016-12-29T14:54:10.000Z" ns5:id="3505101" 
	xmlns:ns14="http://metadata.internal.ft.com/metadata/xsd/metadata_concept_v1.0.xsd" 
	xmlns:ns9="http://metadata.internal.ft.com/metadata/xsd/metadata_taxonomy_v1.0.xsd" 
	xmlns:ns5="http://metadata.internal.ft.com/metadata/xsd/metadata_content_reference_v1.0.xsd" 
	xmlns:ns12="http://metadata.internal.ft.com/metadata/xsd/metadata_notification_v1.0.xsd" 
	xmlns:ns13="http://metadata.internal.ft.com/metadata/xsd/metadata_search_v1.0.xsd" 
	xmlns:ns6="http://metadata.internal.ft.com/metadata/xsd/metadata_tag_v1.0.xsd" 
	xmlns:ns7="http://metadata.internal.ft.com/metadata/xsd/metadata_binding_v1.0.xsd" 
	xmlns:ns10="http://metadata.internal.ft.com/metadata/xsd/metadata_suggestion_v1.0.xsd" 
	xmlns:ns8="http://metadata.internal.ft.com/metadata/xsd/metadata_property_v1.0.xsd" 
	xmlns:ns11="http://metadata.internal.ft.com/metadata/xsd/metadata_count_response_v1.0.xsd" 
	xmlns:ns2="http://metadata.internal.ft.com/metadata/xsd/metadata_party_v1.0.xsd" 
	xmlns:ns1="http://metadata.internal.ft.com/metadata/xsd/metadata_base_v1.0.xsd" 
	xmlns:ns4="http://metadata.internal.ft.com/metadata/xsd/metadata_term_v1.0.xsd" 
	xmlns:ns3="http://metadata.internal.ft.com/metadata/xsd/metadata_lifecycle_v1.0.xsd">  
	<ns5:primarySection ns4:status="ACTIVE" ns4:externalTermId="116" ns4:taxonomy="Sections" ns1:id="MTE2-U2VjdGlvbnM=">  
	<ns4:canonicalName>  
	Comment</ns4:canonicalName>  
</ns5:primarySection>  
<ns5:primaryTheme ns4:status="ACTIVE" ns4:externalTermId="a8e4a619-3c38-41fd-9e20-8ac64ed06447" ns4:taxonomy="Topics" ns1:id="YThlNGE2MTktM2MzOC00MWZkLTllMjAtOGFjNjRlZDA2NDQ3-VG9waWNz">  
	<ns4:canonicalName>  
	Global politics</ns4:canonicalName>  
</ns5:primaryTheme>  
<ns5:tags>  
	<ns6:tag>  
	<ns6:meta ns1:provenance="USER"/>  
<ns6:term ns4:status="ACTIVE" ns4:externalTermId="a8e4a619-3c38-41fd-9e20-8ac64ed06447" ns4:taxonomy="Topics" ns1:id="YThlNGE2MTktM2MzOC00MWZkLTllMjAtOGFjNjRlZDA2NDQ3-VG9waWNz">  
	<ns4:canonicalName>  
	Global politics</ns4:canonicalName>  
</ns6:term>  
<ns6:score ns6:relevance="100" ns6:confidence="100"/>  
</ns6:tag>  
<ns6:tag>  
	<ns6:meta ns1:provenance="USER"/>  
<ns6:term ns4:status="ACTIVE" ns4:externalTermId="8" ns4:taxonomy="Genres" ns1:id="OA==-R2VucmVz">  
	<ns4:canonicalName>  
	Comment</ns4:canonicalName>  
</ns6:term>  
<ns6:score ns6:relevance="100" ns6:confidence="100"/>  
</ns6:tag>  
<ns6:tag>  
	<ns6:meta ns1:provenance="USER"/>  
<ns6:term ns4:status="ACTIVE" ns4:externalTermId="116" ns4:taxonomy="Sections" ns1:id="MTE2-U2VjdGlvbnM=">  
	<ns4:canonicalName>  
	Comment</ns4:canonicalName>  
</ns6:term>  
<ns6:score ns6:relevance="100" ns6:confidence="100"/>  
</ns6:tag>  
<ns6:tag>  
	<ns6:meta ns1:provenance="PREPROCESSOR"/>  
<ns6:term ns4:status="ACTIVE" ns4:externalTermId="f30ca667-0056-4e98-b41e-f99196e324ef" ns4:taxonomy="MediaTypes" ns1:id="ZjMwY2E2NjctMDA1Ni00ZTk4LWI0MWUtZjk5MTk2ZTMyNGVm-TWVkaWFUeXBlcw==">  
	<ns4:canonicalName>  
	Text</ns4:canonicalName>  
</ns6:term>  
<ns6:score ns6:relevance="100" ns6:confidence="100"/>  
</ns6:tag>  
</ns5:tags>  
<ns5:externalReferences>  
	<ns7:reference ns1:cmrId="1227570" ns1:externalId="980913e6-cdd6-11e6-864f-20dcb35cede2" ns1:externalSource="METHODE"/>  
</ns5:externalReferences>  
</ns5:contentRef>  
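
For illustration, the tag entries in a decoded body like the one above can be read with Go's encoding/xml. This is a minimal sketch rather than the service's actual parser: the struct and function names are hypothetical, the sample is a shortened copy of the document above, and it relies on encoding/xml matching elements and attributes by local name when a struct tag names no namespace.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"os"
	"strings"
)

// term, tag and contentRef are hypothetical types covering only the
// fields used in this sketch.
type term struct {
	Taxonomy       string `xml:"taxonomy,attr"`
	ExternalTermID string `xml:"externalTermId,attr"`
	CanonicalName  string `xml:"canonicalName"`
}

type tag struct {
	Meta struct {
		Provenance string `xml:"provenance,attr"`
	} `xml:"meta"`
	Term  term `xml:"term"`
	Score struct {
		Relevance  int `xml:"relevance,attr"`
		Confidence int `xml:"confidence,attr"`
	} `xml:"score"`
}

type contentRef struct {
	Tags []tag `xml:"tags>tag"`
}

// parseContentRef unmarshals the XML and trims the whitespace that
// surrounds each canonicalName value in the source document.
func parseContentRef(data []byte) (contentRef, error) {
	var ref contentRef
	if err := xml.Unmarshal(data, &ref); err != nil {
		return contentRef{}, err
	}
	for i := range ref.Tags {
		ref.Tags[i].Term.CanonicalName = strings.TrimSpace(ref.Tags[i].Term.CanonicalName)
	}
	return ref, nil
}

const sample = `<ns5:contentRef
	xmlns:ns1="http://metadata.internal.ft.com/metadata/xsd/metadata_base_v1.0.xsd"
	xmlns:ns4="http://metadata.internal.ft.com/metadata/xsd/metadata_term_v1.0.xsd"
	xmlns:ns5="http://metadata.internal.ft.com/metadata/xsd/metadata_content_reference_v1.0.xsd"
	xmlns:ns6="http://metadata.internal.ft.com/metadata/xsd/metadata_tag_v1.0.xsd">
<ns5:tags>
	<ns6:tag>
	<ns6:meta ns1:provenance="USER"/>
<ns6:term ns4:status="ACTIVE" ns4:externalTermId="8" ns4:taxonomy="Genres" ns1:id="OA==-R2VucmVz">
	<ns4:canonicalName>
	Comment</ns4:canonicalName>
</ns6:term>
<ns6:score ns6:relevance="100" ns6:confidence="100"/>
</ns6:tag>
</ns5:tags>
</ns5:contentRef>`

func main() {
	ref, err := parseContentRef([]byte(sample))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, tg := range ref.Tags {
		fmt.Printf("%s/%s (provenance %s, relevance %d)\n",
			tg.Term.Taxonomy, tg.Term.CanonicalName, tg.Meta.Provenance, tg.Score.Relevance)
	}
}
```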


Note: Brightcove video metadata goes through the same pipeline as metadata for Methode articles and Wordpress blogs, so brands are added in the same way.