Gsoc:OKS
Participants
- Mentors
- Alexander Morlang (FOKUS)
- Thomas Hirsch (FOKUS)
- Mario Behling
- Technical Support
- Carsten Schmoll (FOKUS)
- Students (GSoC)
- Sagie Amir
- Stefano Pilla
- Adnan Ozsoy
- Alexey Mikhailov
- Students (OKS)
- Johannes Tigges
- Felix Fietkau
- Interns (FOKUS)
- Robert Wuttke
Channels
- repositories:
- wiki:
- mailing list:
- irc: irc://irc.freenode.org/#freifunk
Development
Service Discovery / Avahi / MDNS (Stefano)
Synopsis
The aim of this project is to add interactive mDNS Service Discovery and service monitoring features to the Freimap software.
Design and implementation
The idea is to achieve a two-component architecture: a "server" component on board the mesh nodes, obtained by enhancing the olsrd mDNS plug-in, and a "client" component, obtained by enhancing the Freimap software. Moreover, to permit communication between the two components, a suitable protocol/format has to be devised.
mDNS (Multicast DNS) is a protocol, derived from unicast DNS (Domain Name System), aimed at service discovery over local networks. Through mDNS a user can, for instance, chat and exchange files with other users on the same local network, or find which local printers and web servers are connected.
The scope of mDNS packets is a single link (broadcast domain). For this reason, in an OLSR mesh network, mDNS packets generated by attached hosts cannot traverse OLSR nodes. The olsrd mDNS plug-in permits, through an extension of the OLSR protocol, the diffusion of mDNS packets in an OLSR mesh network. Nodes with this plug-in capture mDNS packets on their attached (HNA) networks and encapsulate them in a new type of OLSR message that can be spread through the whole mesh network; they also decapsulate messages of this type and inject the contained mDNS packets into their attached networks. A remarkable feature of this plug-in is that all nodes in the mesh, including the ones without the plug-in, forward mDNS OLSR messages using the OLSR default forwarding algorithm, so the plug-in can also be used in already deployed mesh networks. In this way, clients on the attached hosts see, from the mDNS perspective, the whole OLSR mesh as a single link, which permits the use of applications that announce themselves through the mDNS mechanism.
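The encapsulation step can be illustrated with a minimal sketch. It assumes the generic OLSR message header from RFC 3626 for an IPv4 mesh; the message type value `MDNS_MESSAGE_TYPE`, the function names, and the choice to carry the raw mDNS payload are assumptions made for illustration and are not taken from the plug-in's code.

```python
import socket
import struct
from typing import Optional

# Hypothetical message type for mDNS-carrying OLSR messages (assumption:
# the value used by the real plug-in is not stated in this document).
MDNS_MESSAGE_TYPE = 200

# RFC 3626 message header (IPv4): type, vtime, size, originator address,
# TTL, hop count, message sequence number. 12 bytes in network byte order.
OLSR_MSG_HEADER = struct.Struct("!BBH4sBBH")


def encapsulate(mdns_packet: bytes, originator: str, seqno: int, ttl: int = 255) -> bytes:
    """Wrap a captured mDNS packet in an OLSR message."""
    size = OLSR_MSG_HEADER.size + len(mdns_packet)
    header = OLSR_MSG_HEADER.pack(
        MDNS_MESSAGE_TYPE,
        0,                              # vtime left at 0 in this sketch
        size,                           # size covers header plus payload
        socket.inet_aton(originator),   # originator address of the node
        ttl,
        0,                              # hop count starts at 0
        seqno & 0xFFFF,
    )
    return header + mdns_packet


def decapsulate(message: bytes) -> Optional[bytes]:
    """Return the mDNS payload, or None if this is another message type."""
    msg_type, _vtime, size, _orig, _ttl, _hops, _seq = OLSR_MSG_HEADER.unpack_from(message)
    if msg_type != MDNS_MESSAGE_TYPE:
        return None
    return message[OLSR_MSG_HEADER.size:size]
```

Because nodes without the plug-in only look at the generic header, they can forward such a message with the default forwarding algorithm without understanding the payload.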
Some olsrd plug-ins, like the txtinfo and httpinfo plug-ins, offer on demand, using a simple TCP listening socket, information regarding the OLSR mesh. The mDNS plug-in can be modified to offer in an analogous way information regarding the services announced in the mDNS packets that traverse, in the form of OLSR mDNS messages, the node where the plug-in is installed. This modification requires the implementation of an mDNS parser and of a listening server providing information on the mDNS services, in a format to be defined (XML, for instance).
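Since the output format is still to be defined, the following is only a sketch of what an XML answer could look like and how a client could read it back. The element and attribute names (`services`, `service`, `name`, `type`, `host`, `port`) are hypothetical, and the real plug-in would be written in C rather than Python.

```python
import xml.etree.ElementTree as ET


def services_to_xml(services):
    """Render a list of service dicts as the XML text a node could return."""
    root = ET.Element("services")
    for svc in services:
        attrs = {
            "name": svc["name"],
            "type": svc["type"],      # e.g. "_http._tcp"
            "host": svc["host"],
            "port": str(svc["port"]),
        }
        ET.SubElement(root, "service", attrs)
    return ET.tostring(root, encoding="unicode")


def xml_to_services(document):
    """Parse the XML text back into service dicts (client side)."""
    root = ET.fromstring(document)
    return [
        {
            "name": el.get("name"),
            "type": el.get("type"),
            "host": el.get("host"),
            "port": int(el.get("port")),
        }
        for el in root.findall("service")
    ]
```

A listening server in the style of txtinfo would write the string produced by `services_to_xml` to each connecting client and close the connection.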
On the side of the Freimap software, the idea is to have a context menu that permits interaction with a selected node. When a Freimap user requests information on active services, the selected node is polled (using the mechanism described above) and the information is presented to the user.
XMPP PubSub (JoTi)
tbd.
OpenIMP Collector (Robert)
tbd.
Distributed Measurement
tbd.
Service Control for freimap (Sagie)
Synopsis
This project aims to enhance the Freimap mesh network analysis tool by adding functionality to actively interact with the network being monitored. Another goal is to port Freimap to the Eclipse platform. This will allow adopting a model driven development approach through the use of the Graphical Modeling Framework (GMF).
Network interaction will enable remote command execution on mesh nodes & will be available through context driven menus. Design will focus on modularity to enable future interaction forms to be built on top of the remote execution module.
Design
A major goal of this project is to lay good foundations on which further functionality can later be added. This requires a good understanding of what the tool is expected to do. Will a 3D rendering engine eventually be introduced? Will network simulations require the ability to inject fake faults into the mesh network? Will nodes always be mapped by geo coordinates? More immediate questions are whether SSH can handle all remote interaction needs in terms of speed & functionality, whether batch application of network settings to multiple selected nodes is required, & how to cater for live data capturing.
As a starting point, a clearer 'multi-tier' design should be visible in the packaging system. This will be accomplished by adopting a model driven design & making the transition to GMF. Several advantages will stem from the use of GMF:
- Code generation. GMF uses an Ecore model backend which supports code generation. Code for the Eclipse editor plug-in, the diagram editor & some testing code is all automatically generated. UML diagrams can also be produced.
- Model driven design. This approach should scale very well with any future features added. Making changes later should also be relatively easy since they are usually achievable through the model.
- MVC architecture. GMF by default separates model & view components.
Porting Freimap to Eclipse will make it relatively easy to leverage many advanced features such as automatic updates, context driven popup menus, multiple view support, etc. The newly introduced SSH remote execution module will be designed so that new remote functionality can be easily added (i.e. full command wrapping & execution through a dedicated thread; an example of such a command wrapper class is available in the Zenmap project).
Multiple element selection (UI) should be expressed through design even if initially not used, with the intention of basing relevant context menu items on it. Another design objective is to be able to present real time charting in a way which would blend naturally into the UI design while avoiding a major performance penalty.
Sought Features & Implementation
[-] SSH and JSON/RPC communication channels with remote nodes
SSH interaction may use pssh, OpenSSH or similar. Once setting up SSH channels is available in the core, a select set of commands will be implemented, including traceroute, ping, node restart, etc. Later, the option to open an SSH console to a selected Freifunk node will be added to the UI (this will be introduced as a dockable widget pane; connectable nodes will have a special icon).
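A minimal sketch of a command wrapper for the SSH channel follows, assuming the OpenSSH `ssh` client is installed and key-based login to the node is already set up. The command table, the function names and the parameter names are hypothetical; Freimap itself is a Java application, so this only illustrates the wrapping idea.

```python
import shlex
import subprocess

# Hypothetical table of remote commands that the UI may invoke.
COMMANDS = {
    "ping": "ping -c {count} {target}",
    "traceroute": "traceroute {target}",
}


def build_ssh_command(node, user, command, **params):
    """Build the argv list for running a whitelisted command on a node."""
    template = COMMANDS[command]        # KeyError for unknown commands
    # Quote every parameter so that it cannot inject extra shell commands.
    quoted = {key: shlex.quote(str(value)) for key, value in params.items()}
    remote = template.format(**quoted)
    # BatchMode makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{node}", remote]


def run_on_node(node, user, command, timeout=30, **params):
    """Run the command and return (exit status, standard output)."""
    argv = build_ssh_command(node, user, command, **params)
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout
```

Keeping command construction separate from execution makes the wrapper testable without a live node, and `run_on_node` could be called from a dedicated worker thread so that the UI stays responsive.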
[-] visualization layers
Many useful visualization techniques are available for mesh network analysis. To name a few:
- automatic node clustering based on zoom
- varied edge opacity based on the number of retransmissions / TX collisions
- 'heat' colorizing of edges corresponding to signal strength, avg. throughput, load vs. signal strength and so on
These could make mesh bottlenecks stand out & point out unreliable network segments.
The 'VisorLayer' is a good start but would probably have to be expanded to support all the different possible filtering modes. Once a satisfactory layer representation is obtained, the next step would be to implement a tree style selection pane with selection boxes to easily un/hide which visorlayers are displayed (this would use the powerful GTK treemap model). The ability to selectively pick data layers would also keep large scale scenarios clear.
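The edge opacity and 'heat' colorizing ideas boil down to mapping a link metric onto a visual attribute. The sketch below shows one possible mapping; the function names, the green-to-red ramp and the opacity floor are assumptions, not part of the existing VisorLayer code.

```python
def _clamp01(x):
    """Clamp a number to the range [0, 1]."""
    return min(1.0, max(0.0, x))


def heat_color(value, low, high):
    """Map a metric in [low, high] to an RGB tuple from green to red."""
    if high <= low:
        raise ValueError("high must be greater than low")
    t = _clamp01((value - low) / (high - low))
    return (round(255 * t), round(255 * (1 - t)), 0)


def edge_opacity(retransmissions, max_retransmissions, floor=0.2):
    """More retransmissions give a more transparent edge, never below floor."""
    if max_retransmissions <= 0:
        return 1.0
    t = _clamp01(retransmissions / max_retransmissions)
    return 1.0 - (1.0 - floor) * t
```

For a metric where a high value is good (such as signal strength), the caller would swap `low` and `high` semantics, for example by passing the negated metric.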
[-] UI
Context driven menus while a single node is selected will include:
- SSH command invocation: traceroute, ping, etc.
- open SSH console
- display SNMP related data (if available)
- list available services
- monitor real-time statistics (dockable pane)
Context driven menus while a single edge is selected will include:
- display statistics
- simulate fault (long term?)
- sniff connection (long term?)
See a UI mockup adapted from my last year's OLPC proposal.
Animation control should probably reside in a dedicated animation toolbar.
Overall the UI won't present groundbreaking technology, but a special effort will be made to make the UI intuitive & clear. Integrating core functionality once tested should be relatively straightforward.
Development Methodology
Emphasis will be placed on test-driven development, with the intention of preceding every feature implementation with the tests necessary to verify its correctness. All functionality will be developed from the bottom up, ending at UI integration.
Further investigation should be made into the authentication model that is used for SSH connection setup. It could be very useful for willing participants to enable some form of remote connection while researching / testing protocol performance.
Deliverables
- Initial port to the Eclipse GMF framework
- three-tier layering of the current code base via UI, MODEL & Data code packaging
- remote node command channels via SSH
- context driven menus for single selected nodes / links
- context driven menus for multiple selected elements
- layer selection pane
- overall documentation of bulk of current code base & all additional code including UML class diagrams
- implement the tile cache ( optional )
- internationalization of UI ( optional )
I wouldn't want to overcommit, but I feel that touching database output issues may also be within the time scope of this project.
Road Map
- collect feedback from the Freifunk community regarding proposed features & implementation approaches.
- grow familiar with OpenWrt & firmware code (until mid-end April)
- set up a test environment with several data sources (two weeks)
- transition to Eclipse
- implement the basic SSH command execution channel & basic command execution
- implement layering & filtering functionality
- integrate SSH capabilities into UI (beginning of July)
- mid-term evaluation : working remote command execution with UI integration (two weeks)
- layering UI integration (2 weeks)
- statistics UI integration (beginning of August)
- testing & documentation of all features (2 weeks)
SNMP DataSource for freimap (Adnan)
SNMP is used in network management systems to monitor network-attached devices for conditions that warrant administrative attention. By adding an SNMP data source to Freimap, we can monitor virtually any system value of a node in near real-time and react accordingly. The task includes evaluating the available SNMP elements for inclusion in the visualization environment of Freimap and implementing the actual visualization. At the end of this project, Freimap will be able to display the monitored network information visually.
Effective Measurement / IPFIX Kernel Probe (Alexey)
Network measurement is an essential task for any network. Among other things, it can be used for:
- Finding anomalous behaviour in traffic
- Finding 'bottlenecks'
- Billing purposes
- Visualizing network behaviour
Network sizes and speeds are growing. Packet-based solutions do not fit well into modern networks, as the computational cost keeps getting higher. The most efficient solutions are flow-based (we define a flow as a unidirectional sequence of packets that share some common properties). The problem here is generating flows efficiently on high-speed links (the metering process). This problem matters most for small or embedded devices, where CPU and memory resources are limited. I propose a new, efficient metering process for obtaining flows on high-speed links, even on embedded devices.
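To make the flow definition concrete, here is a small sketch that aggregates packet records into unidirectional flows. Using the classic 5-tuple as the "common properties" and the dictionary field names are assumptions made for illustration; the probe itself relies on conntrack in the kernel rather than on user-space code like this.

```python
from collections import namedtuple

# Assumed flow key: the classic 5-tuple.
FlowKey = namedtuple("FlowKey", "src dst proto sport dport")


def aggregate(packets):
    """Group packet records into unidirectional flows with counters."""
    flows = {}
    for pkt in packets:
        key = FlowKey(pkt["src"], pkt["dst"], pkt["proto"], pkt["sport"], pkt["dport"])
        record = flows.setdefault(
            key, {"packets": 0, "bytes": 0, "start": pkt["time"], "end": pkt["time"]}
        )
        record["packets"] += 1
        record["bytes"] += pkt["length"]
        record["start"] = min(record["start"], pkt["time"])
        record["end"] = max(record["end"], pkt["time"])
    return flows
```

A reply packet has source and destination swapped, so it lands in a separate flow; this is what makes the flows unidirectional.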
Usually the metering process is located in user space and uses libpcap to capture packets (a notable example is OpenIMP, which was proposed on the ideas page). These solutions are sub-optimal because every network packet has to be passed from kernel space to user space (context switches, excessive copying). Packets can even be lost this way (have you ever used tcpdump on a high-speed link?). There are solutions for efficient packet capture based on zero-copy mechanisms (e.g. PF_RING), but they are still sub-optimal: even when the extra copying is avoided, context switches remain and packets still have to be handed from kernel space to user space.
I propose a new, efficient metering process based on the Netfilter conntrack subsystem. This subsystem already aggregates packets into flows (using L3 attributes, which is more than enough).
Firstly, we extend the conntrack subsystem to hold additional information about a connection that is not part of the L3 layer (input/output interfaces, next hop, etc.).
Secondly, we listen for DESTROY events to export finished connections. Another problem here is temporary (intermediate) export. Assume we have a long TCP session (e.g. somebody downloads a DVD image from an FTP server). If we export information about this connection only when it finishes, we lose track of the current state of the network. Imagine a plot of network load for some IP address: without temporary information about such a connection, the plot will show spikes (download finished, flow exported), whereas we really want a smooth one. That is why temporary export is essential here.
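The difference between the final export on DESTROY and the temporary export of long-lived connections can be sketched as follows. The class, its method names and the choice to export byte counts as deltas since the last export are assumptions for illustration; the actual kernel implementation is not shown in this document.

```python
class FlowTable:
    """Toy flow table with periodic (temporary) and final export."""

    def __init__(self, export_interval):
        self.export_interval = export_interval   # seconds between temporary exports
        self.flows = {}

    def update(self, key, nbytes, now):
        """Account for nbytes seen on the flow at time now."""
        record = self.flows.setdefault(key, {"bytes": 0, "last_export": now})
        record["bytes"] += nbytes

    def export_due(self, now):
        """Export active flows that have not been reported for a full interval."""
        records = []
        for key, record in self.flows.items():
            overdue = now - record["last_export"] >= self.export_interval
            if overdue and record["bytes"]:
                records.append((key, record["bytes"], "intermediate"))
                record["bytes"] = 0              # counters are deltas in this sketch
                record["last_export"] = now
        return records

    def destroy(self, key):
        """Final export, triggered by the DESTROY event of the connection."""
        record = self.flows.pop(key)
        return (key, record["bytes"], "final")
```

With an interval of 60 seconds, a download that lasts an hour produces a record every minute, so a load plot grows smoothly instead of showing one spike at the end.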
Thirdly, we need a protocol to pass information about flows. NetFlow is the most popular one (invented by Cisco and used in its hardware), but this protocol is proprietary. There is a fresh, open and flexible standard for exporting IP flow information known as IPFIX (RFC 5101, RFC 5102, RFC 5103...). It is based on NetFlow v9 and provides a deep information model, flexible templates and bidirectional flows (RFC 5103). Bidirectional flows are essential for connection tracking, as we need information about both endpoints; moreover, they are very useful for measurement purposes. Some example applications are:
- Separating traffic into "answered" and "unanswered"
- Obtaining the RTT is trivial (reverse flow start time minus forward flow start time)
- Full reconstruction of a TCP session
- With NetFlow version 5 you can't even support IPv6.
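The first two applications in the list can be sketched in a few lines: pair each unidirectional flow with its reverse, mark flows without a reverse as unanswered, and estimate the RTT as the difference of the start times. The tuple layout of the key and the function names are assumptions; start times are integers in milliseconds here.

```python
def reverse_key(key):
    """Swap the endpoints of a (src, dst, proto, sport, dport) key."""
    src, dst, proto, sport, dport = key
    return (dst, src, proto, dport, sport)


def pair_biflows(starts):
    """Map each initiating flow to an RTT estimate, or None if unanswered.

    starts maps a unidirectional flow key to its start time in ms.
    """
    biflows = {}
    paired = set()
    # Walk flows in start order so that the initiator is seen first.
    for key in sorted(starts, key=starts.get):
        if key in paired:
            continue                    # already used as a reverse flow
        rkey = reverse_key(key)
        if rkey in starts:
            paired.add(rkey)
            biflows[key] = starts[rkey] - starts[key]   # answered
        else:
            biflows[key] = None                          # unanswered
    return biflows
```

With native bidirectional flow records (RFC 5103) this pairing is already done by the exporter, which is why the calculation becomes trivial.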
IPFIX is a very young protocol; its RFCs date from 2008. There are not many tools for measurement purposes (openIMP, CERT software, VERMONT). They are all pcap-based and do not support bidirectional flows (e.g. SiLK splits a bidirectional flow into two unidirectional ones).
Fourthly, we need an efficient way to pass IPFIX-formatted flows out of kernel space. We can use the relay subsystem here, which provides an efficient zero-copy mechanism for communication between kernel space and user space. As it is poll-based, it can be turned into an event-driven mechanism using the Netlink protocol.
Basically, it goes like this:
Conntrack --(export)--> conntrack_to_ipfix --(ipfix packet)--> relay_write()
User-space process --(listen for netlink events)--> relay_buffer_to_stdout()
I have been working on this project for 10 months already and have an implementation of these ideas, which has proven to be far more efficient than any other tool. Using such ideas matters most for embedded devices, as I mentioned before. What has already been done:
- libipfix was ported to the Linux kernel; support for bidirectional flows was added and the memory footprint was reduced
- conntrack was extended to cover the NetFlow v5 format (at least)
- the relay/netlink communication mechanism is implemented
- the user-space tools are working
It currently works in a very specific environment (our network), so I will need to work on generalization. The code of this project is available on request (it's open source, but I just don't like to publish software that is not intended for use by end users).
Deliverables
- The IPFIX probe will be generalized and released as a module.
- The IPFIX probe will be ported to the OpenWrt platform.
- Documentation will be written, and it will be possible for end users to configure the probe in an easy way (e.g. through a web interface).
- IPFIX support will be integrated into Freimap.
- The implementation will be tested using Freimap and a test framework that will be provided with the module.
Benefits
- Any OpenWrt-enabled device could act as an IPFIX probe. It will be reliable and highly efficient. Even embedded devices could cope with high-speed links. No additional overhead will be introduced, as the connection tracking subsystem is already there (e.g. it is required for NAT).
- Freimap will gain IPFIX support, which will provide a great amount of information that we could visualize in a convenient way.
- The applications of IPFIX are practically countless. For example, you can use it to find anomalous behaviour in network traffic using statistical analysis (e.g. Holt-Winters).
Existing Components
wprobe (Felix)
- Link documentation!