Wireshark: traces for PopCorn Hour

Now that I have a PopCorn Hour on the network, I wanted to measure its impact on my local network:

  • 160 packets at boot within 1 min 30 s

It mainly generates multicast DNS (mDNS) and SSDP traffic. Stopping their proprietary myihome protocol greatly reduces these broadcasts.

Surprise: my PC started sending requests to the PopCorn over UDP every 30 seconds on port 427 (srvloc), but the PopCorn refuses them with ICMP unreachable:

TCPView shows the process listening on this port on my PC, but we only get the hosting process, svchost:

wireshark_popcorn_tcpview

Process Explorer solves the mystery:

wireshark_popcorn_procexp

So it's my HP driver that sends these requests every 30 seconds, even though the PopCorn answers with ICMP unreachable... HP never quits... Argh!

Here is a Wireshark filter to quickly find a PopCorn on the network:

http.server contains "Syabas myiBox"

SCOM 2007 R2: monitor an Amazon EC2 Linux VM

What a strange idea, you may say... But having some VMs in the cloud doesn't mean they must be monitored separately. That said, it's quite a challenge:


  • Management Packs from Microsoft don’t support Fedora
  • Authentication on Amazon VMs is certificate-based by default

When SCOM detects Fedora, it stops right away:

To get the Linux version, SCOM executes this shell script:

Incidentally, this folder also contains the RPM package installed next (the agent daemon). Reaching our goal will take some work...

To keep you motivated, here is one result once it works:

SCOM performance view for an Amazon EC2 VM

You got it: the trick is getting SCOM to believe we are running Red Hat. We have two options:

  • Create our own Management Pack for Fedora.
  • Disguise Fedora as Red Hat to fool SCOM.

The first is the more elegant, but it would take a lot of time to build and then maintain. So I went with the second, as Fedora is not that far from Red Hat...

Changes on Linux Amazon EC2

Allow SSH connection without certificate

In /etc/ssh/sshd_config:

PasswordAuthentication yes
PermitRootLogin no
# Also remove the last line of the file, which allows root login without a password

Then you need to set a root password:

passwd root

Then reload the configuration:

/etc/init.d/sshd reload
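The edits above can also be scripted with sed. Here is a sketch run on a scratch copy, so you can review the result before touching the real /etc/ssh/sshd_config (the directive names are standard, but check your file first):

```shell
# Work on a scratch copy first; point cfg at /etc/ssh/sshd_config when ready.
cfg=$(mktemp)
printf '%s\n' '#PasswordAuthentication no' 'PermitRootLogin yes' > "$cfg"

# Force the two directives to the values we want, whether commented out or not.
sed -i -e 's/^#\?PasswordAuthentication.*/PasswordAuthentication yes/' \
       -e 's/^#\?PermitRootLogin.*/PermitRootLogin no/' "$cfg"
cat "$cfg"
```

Don't forget the `/etc/init.d/sshd reload` afterwards, as above.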

Then create a SCOM account:

useradd scom
passwd scom

Hostname check


By default, the VM uses its internal IP as its hostname. SCOM will deploy an RPM to run an agent daemon, and will generate a certificate for authentication.

The hostname must match the name SCOM uses to reach its agent. The easiest fix is to change the hostname to a name SCOM can resolve, like www.mydomain.com:
hostname www.lotp.fr
To keep it after a reboot, you need to update /etc/sysconfig/network by adding:
HOSTNAME=www.mydomain.com
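The persistence step can be scripted too. This sketch uses a scratch file standing in for /etc/sysconfig/network (the hostnames are examples), replacing an existing HOSTNAME= line or appending one if absent:

```shell
# Scratch stand-in for /etc/sysconfig/network; the EC2 default looks like this.
f=$(mktemp)
echo 'HOSTNAME=ip-10-0-0-1' > "$f"

# Replace an existing HOSTNAME= line, or append one if there is none.
if grep -q '^HOSTNAME=' "$f"; then
  sed -i 's/^HOSTNAME=.*/HOSTNAME=www.mydomain.com/' "$f"
else
  echo 'HOSTNAME=www.mydomain.com' >> "$f"
fi
cat "$f"
```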

Dependencies

The RPM package from Microsoft depends on at least these two libraries:

  • /usr/lib/libssl.so.6
  • /usr/lib/libcrypto.so.6

If your system is more up to date (.so.8, ...), you will need to create symbolic links (ln -s existing_library compat_name):

If you don't, you will get these error messages when installing the RPM (first for libssl, then for libcrypto once the first is resolved):
In the end, I have the following symbolic links:

lrwxrwxrwx  1 root root  16 Feb 16 16:19 libssl.so.6 -> libssl.so.0.9.8k
lrwxrwxrwx  1 root root  12 Feb 16 16:23 libcrypto.so.6 -> libcrypto.so
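On the real VM the links are created with something like `ln -s /usr/lib/libssl.so.0.9.8k /usr/lib/libssl.so.6` (adjust the versions to whatever your system actually ships). The mechanism, demonstrated in a scratch directory with empty files standing in for the real libraries:

```shell
# Scratch directory; the touched files stand in for the real libraries.
d=$(mktemp -d)
cd "$d"
touch libssl.so.0.9.8k libcrypto.so.0.9.8k

# ln -s <existing library> <compatibility name the RPM expects>
ln -s libssl.so.0.9.8k    libssl.so.6
ln -s libcrypto.so.0.9.8k libcrypto.so.6
ls -l libssl.so.6 libcrypto.so.6
```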

I recommend copying the RPM (scx-1.0.4-248.rhel.5.x86.rpm) over and installing it manually, to check that the installation succeeds. Otherwise SCOM will do it during discovery, and its error messages are not very explicit.
Moreover, the install shows the name used to generate the certificate:

You can always remove it:

rpm -e scx-1.0.4-248.i386

TCP Ports

SCOM connects in two ways:
  • SSH
  • port 1270 (scom agent)

You will have to configure the Amazon firewall (security group) to make these two ports reachable. If port 1270 is not reachable, SCOM raises a timeout error, which doesn't help much. Traffic to these ports is confirmed by a network trace:

/etc/redhat-release

By default, Fedora creates /etc/redhat-release as a symbolic link to /etc/fedora-release. You must break it and create a real redhat-release file with this text inside:

rm /etc/redhat-release
echo "Red Hat Enterprise Linux Server release 5" > /etc/redhat-release

Changes on SCOM 2007 R2

Tasks to do:

  • Allow WinRM to use basic authentication
  • Import the generic Linux/Unix and Red Hat Management Packs
  • Add a Basic Authentication account to log on to Linux (non-root)
  • Add a Basic Authentication account to log on to Linux (root)
  • Link them to profiles “Unix Action Account” and “Unix Privileged Account”
  • Discover the VM and sign certificate

Adding the VM is straightforward if you followed the tasks above, so I will only cover the problems I met.

Allow WinRM to use basic authentication:

If the hostname is wrong, the certificate is refused when it's time to sign it:

If you don't add the Linux accounts to the profiles, the workflows don't work:

At the beginning, the VM is not monitored and seen as unknown:

Then it switches to green and known:

Here we go, your VM is now monitored by SCOM 2007 R2.

Here are some screenshots once it works:

Amazon EC2: Very old Fedora image!

I wasn't suspicious enough at the beginning. Partly because, although I know Linux, I had never used Fedora before, so I didn't know the current version number...

How could I imagine they would provide a two-year-old image? As a reminder, here is what they provide:

Fedora 8 is old enough that you can't upgrade in one shot: you have to upgrade to version 10 first, then to version 11. I realized this while trying to install Nagios 3 and not being able to find it...

I found the upgrade procedure on the Amazon forums:

http://developer.amazonwebservices.com/connect/message.jspa?messageID=141707

In short:

Upgrade to version 10:

yum update -y
yum clean all
yum update -y
yum clean all
rpm -Uhv http://mirrors.kernel.org/fedora/releases/10/Fedora/i386/os/Packages/fedora-release-10-1.noarch.rpm http://mirrors.kernel.org/fedora/releases/10/Fedora/i386/os/Packages/fedora-release-notes-10.0.0-1.noarch.rpm
yum clean all
yum update -y
yum clean all
yum update -y
yum clean all

Upgrade to version 11:

rpm -Uhv http://mirrors.kernel.org/fedora/releases/11/Fedora/i386/os/Packages/fedora-release-11-1.noarch.rpm http://mirrors.kernel.org/fedora/releases/11/Fedora/i386/os/Packages/fedora-release-notes-11.0.0-2.fc11.noarch.rpm
yum clean all
yum update -y
yum clean all
yum update -y
yum clean all

I highly recommend upgrading Fedora before anything else. Since I realized the problem near the end of my setup, I had package issues after upgrading, and there were even some packages I had to remove beforehand due to dependency problems!

MySQL

For one thing, flush privileges wasn't working anymore:

mysql> flush privileges;
ERROR 1146 (42S02): Table 'mysql.servers' doesn't exist
I had made a backup before upgrading, and this table wasn't there anyway. I ran a MySQL repair and upgrade:

mysqlcheck --all-databases --repair -u root -p
mysql_upgrade -u root -p
Yum
Since the upgrade, every yum call generates these messages, just before working anyway:
Loaded plugins: dellsysidplugin2, fastestmirror
ERR_OUT: : Bad address
ERR_OUT: : Bad address
ERR_OUT: : Bad address
ERR_OUT: : Bad address
ERR_OUT: : Bad address
I found the solution in this blog: http://d.hatena.ne.jp/const/20090909
It's due to the smbios-utils and smbios-utils-python packages, which help retrieve BIOS information. Since this is a virtual machine and I don't have access to the BIOS anyway, I don't care:
yum remove smbios-utils-python

WDS/MDT: removing F12

  • By default, to boot from the network through WDS, you need to:
    • set up the BIOS to boot from the network (F12 in many BIOSes)
    • once you get a DHCP lease, quickly press F12 again to really boot from the network

This second keystroke is a safety default. If the boot order puts the network before the hard drive, computers would try to boot from the network every time. Most of the time, we only boot from the network to install the OS, and always from the hard drive afterwards.

But if your BIOS is set up correctly, the second F12 is unneeded. You just have to replace pxeboot.com with pxeboot.n12 to remove it:

If you already have a large number of computers deployed, you can configure their BIOS settings centrally. Dell and HP provide tools to set their BIOS remotely (generally through a deployed executable):

From dedicated server to Amazon EC2

I am migrating from a dedicated server (Dedibox, a French provider) to Amazon EC2. Here is my first feedback on the Amazon cloud...

Why I am migrating
My current provider does not allow true virtualization on its servers. The hardware is strong (Dedibox Pro line), but they prevent hypervisors through their switches, which shut down any port showing more than one MAC address. So, without VMware ESXi or Hyper-V, I fell back on VMware Server on top of CentOS.
The VMs are just slow, even with only 5 of them. So I have good hardware that I can't really use, and thus can't amortize. It's no accident: virtualization is the best and most common way to use these 8 GB of RAM and a quad-core CPU.
At least one other provider, OVH, supports ESXi, but they charge quite a lot for it (15€/month just to be allowed more IP addresses...).
As my end goal is virtual machines, Amazon EC2 quickly looked like a good choice. In the worst case, I will pay as much as now, but I will get what I pay for!

Step 1: Amazon calculator
Speaking of money, the first difficulty is knowing how much it will actually cost! Well, to be honest, the first one is understanding which instance type is right for me:

  • On demand: no commitment, pay as you use.
  • Reserved: pay a one-time fee, then pay as you use at a lower rate.
  • Spot: you don't know when you will get your VM, but it costs less than On demand (though more than Reserved).

As you can guess, an On-demand VM powered on all the time costs much more than a Reserved one used the same way.
Their calculator has a design flaw: by default, it lumps the one-time fee into the first month's bill. The quick reflex is to multiply the number you see by 12, and then you get scared for the wrong reason!


So, to get the final cost per month, for one small reserved instance, always on:

(227.50 $ + 29.98 $ × 12) / 12 = 48.94 $ per month
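Spelled out, the one-time fee is amortized over the 12 months and the monthly usage is added on top:

```shell
# 227.50 $ one-time fee spread over 12 months, plus 29.98 $/month of usage.
awk 'BEGIN { printf "%.2f\n", 227.50 / 12 + 29.98 }'
# prints 48.94
```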

That's roughly 34€ per month. For this amount, you get a 32-bit Linux VM with:

  • CPU: 1 virtual core at 1.7 GHz (1 EC2 Compute Unit)
  • Memory: 1.7GB
  • Storage: 169GB

The weird thing is the $44 of taxes on the bill (more on that later in this post)...

Step 2: Reserve an instance
Things are getting serious! First, though, I ran some benchmarks on an On-demand VM to be sure about performance. You first choose the size of the VM (small in my case) and where it will be hosted (two sites in the USA, one in Europe (Ireland)). I chose Ireland, to avoid the latency across the Atlantic. After five big clicks, the reservation is pending. After 10 minutes, like at a Formula 1 race, it is active. Then? Nothing happens! No earthquake, and no virtual machine in the list. Amazon's help resources are nearly clueless. I finally understood that you have to create an On-demand VM in the same site as your reserved one. As simple as it is scary, because nothing on the interface reassures you that you are not going to pay for two instances. You have to trust Amazon and their billing system. Well, it was fine for me: I didn't pay twice!
So far, a reserved instance can only be Linux. Windows is restricted to On demand, and it costs more than Linux because the license comes with the VM.
The architecture choice (32/64 bit) depends on the size of the VM.

Step 3: create the VM
Now that I understand and still trust Amazon, creating a VM is really simple: five clicks and five minutes of waiting. You don't install your own OS from an ISO; you use a prepared image, called an AMI (Amazon Machine Image). Amazon provides 5 AMIs, but the community extends the choice to 971.

The storage for Linux is 10 GB for the system and 150 GB for data. The VM sees two partitions:

Filesystem   1K-blocks  Used     Available  Use%  Mounted on
/dev/sda1    10321208   1383500  8413420    15%   /
/dev/sda2    153899044  6793004  139288416  5%    /mnt
none         870472     0        870472     0%    /dev/shm

When you create the VM (and at any time afterwards), Amazon provides a firewall outside the VM, so you can manage incoming traffic there. It's handy, as it saves you an iptables setup:

Step 4: Log on the instance
Now that the VM is running, it's time to log on. A right click on the instance brings up this menu:

The Connect link gives:

You have understood:

  • No console access (Get System Log shows you the boot log)
  • No root password; instead, a certificate is generated when creating the VM

On Windows, PuTTY is often used as a free SSH client, but PuTTY can't read the Amazon certificate directly. Amazon explains how to work around it here:
http://docs.amazonwebservices.com/AmazonEC2/gsg/2007-01-19/putty.html

In short, you use PuTTYgen, from the author of PuTTY, to convert the certificate into a format PuTTY understands. After that, it works great:

Step 5: Performance
This is the critical part when it comes to virtual machines. Good news: I am very happy with the performance:

  • Network: easily reaches 20 Mb/s
  • Storage: 121 MB/s sequential write (dd if=/dev/zero of=dd.dd bs=1M)
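The storage figure comes from dd. A repeatable version of the test (sequential write only, so treat the number as an order of magnitude, not a storage spec; 100 MB here just to keep the run short):

```shell
# Sequential write test; conv=fsync forces the data to disk before dd
# reports, so the figure isn't just the page cache.
dd if=/dev/zero of=/tmp/dd.dd bs=1M count=100 conv=fsync
rm -f /tmp/dd.dd
```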

Why does Amazon deliver that performance? Because they bill the network as you use it ($0.170 per GB of traffic). Unlike hosts that bill all-inclusive, Amazon won't lose money if a heavy user shows up. With my previous provider, I could use 100 Mb/s non-stop without paying more, so they oversubscribed their outside connections (peering), and the network was slow.

I prefer to pay a bit more when I use more, but get this quality. You can follow your usage, and thus your next bill, nearly in real time:

Twitter: where are the nuggets?

Until now, I never got into the idea of Twitter. The information found there is very often also in blogs, just unstructured. I do think, however, that there are scattered "nuggets" you won't find in blogs or anywhere else. Microsoft and Google must think so too, since they are going to pay Twitter to index its content.
To give microblogging another chance, I equipped myself with Twitter-oriented software: TweetDeck on the PC and the iPhone (free). I also have twitBird Premium on the iPhone (grabbed while it was free). It seems better to me than the TweetDeck version, but the latter has the advantage of syncing with what I do on my PC (Evernote-style).

I also think we all have a limit to the amount of information we can take in at once. I already have 155 subscriptions in Google Reader, which already take quite some time to absorb every day. A quick look at Google Reader's Trends feature tells me I read 3,027 posts in 30 days, which is still 100 posts per day. That number is largely due to 3 sources with more than 500 posts each (the SCOM forum, Journal du geek, PC inpact). To really give Twitter a chance, I have to make room for it within the amount of information I can absorb.

The SCOM forum is the biggest stream. I follow it because there are a few nuggets from time to time that seemed worth the effort. But since I don't do SCOM every day, it looks like the ideal candidate to be replaced by Twitter!

It reminds me of when I moved from a Firefox folder holding all the blogs I followed to Google Reader. Every day, I would open all the blogs at once and close them with Ctrl+W if I already knew the first post. It took time for ultimately few new posts (some blogs, including mine, have low activity). I followed about 75 that way. With Google Reader, I can follow 155, some of them very active, since they are actually news sites relayed by many blogs (I don't see the value of announcing on your blog that some version of a piece of software is out when plenty of specialized sites already relay the news, but anyway...). By making room for Twitter, I think I can climb another step in the amount of information I can ingest. Being able to digest part of it during dead time (commuting, waiting...) on the iPhone makes it much easier.

So I created a Twitter account (http://twitter.com/mathieuchateau). Contact import, to find people to follow, is limited: you can't supply a CSV or similar file, and Gmail import doesn't work with a Google Apps account. I had to export and re-import into a standard Gmail account to get anything out of it. 14 contacts on Twitter out of 426 is quite few! Many sites report that Twitter has little traction in Europe, particularly in France; my numbers don't contradict them.

What would you like to find on Twitter? Will you follow me on Twitter? Do you have nuggets (accounts you follow) that you are willing to share?

Windows Server 2008 R2 on my mac book air!

This post follows a previous one: I have a Mac... How bad is it, doctor?
I installed Windows 7 RC, which was working great. Nevertheless, I needed to run Windows Server 2008 R2 with Hyper-V. As that can't run inside a VM, I decided to install it on my MacBook Air.
The steps are easy, but there is one critical step you must not miss... Here we go:
Copy the BootCamp folder from the Mac OS X DVD to a USB key; it will be mandatory later!

Start under Mac OS X. Through the Boot Camp application, free up hard drive space for Windows.
Boot from the Windows Server 2008 R2 DVD and do a normal install.
At the first logon, you must immediately install the Boot Camp drivers from the USB key (the external DVD drive doesn't work from Windows yet). Launch the MSI directly. If you don't, and you restart or log off, you won't be able to log on again: the Apple keyboard driver isn't installed, so Ctrl+Alt+Del won't work!
I had to connect an external USB keyboard to recover!

Then you just have to make Windows Server 2008 R2 look like Windows 7 through PowerShell:
import-module servermanager
Add-WindowsFeature Desktop-Experience
Add-WindowsFeature Wireless-Networking
Add-WindowsFeature Net-Framework
Add-WindowsFeature Telnet-Client
Set-Service -name Themes -startuptype Automatic
shutdown /r /t 15 /c "reboot to make the Themes service work"

Then you just have to activate the Windows 7 theme.

Installing Hyper-V prevents the laptop from hibernating:
Sleep and hibernate power features are not available when you enable Hyper-V technology on a Windows Server 2008-based portable computer

So I added a boot option to load Windows without Hyper-V:
bcdedit /copy {current} /d "Microsoft Windows Server 2008 - without hypervisor"
bcdedit /set {GUID from the previous command} hypervisorlaunchtype off

SharePoint 2007 slowness

I recently had to solve two SharePoint 2007 slowness issues:

  • The first call after an application pool recycle: 2 minutes to get the page
  • Searching for people in AD: at least 30 seconds

2 minutes to get the page

IIS tries to contact crl.microsoft.com over HTTP, but can't. This is done to verify assembly signatures in the GAC. Causes and solutions are explained in this blog post:

http://www.muhimbi.com/blog/2009/04/new-approach-to-solve-sharepoints.html

I chose the following one:

That drops the load time to 20 seconds, which is "normal" since it still has to recompile. You can go further by:

  • Adjusting when the application pool is recycled
  • Using SPWakeUp to "warm up" the SharePoint engine by calling every site once

Search in AD: at least 30 seconds

When: you try to authorize a user or a group, or just assign a task to someone. This takes around 30 seconds instead of one. I am measuring the time between the click and the name being underlined.

Causes: in my case, tracing the server's network activity, I realized it was failing to contact the domain controllers of another forest when I checked a user name. The attempt comes from adding another forest to the people picker property through stsadm. By default, SharePoint only checks users in its own domain. To extend the search to other domains or forests, you add them with this command:
stsadm -o setproperty -url http://SharePointSite:85 -pn peoplepicker-searchadforests -pv "domain1.com",,;"domain2.com",,

If there is a trust between SharePoint's domain and the target one, you don't have to provide credentials.

Here is a more detailed post: http://www.gk.id.au/2009/04/people-picker-sharepoint-and-forest.html

Solution: allow SharePoint to connect to the domain controllers of these targets. The port to open is LDAP (389), both TCP and UDP.

That's it: it now takes around one second again, and we can search for people in the other forest!

All you need to know on NLB

NLB (Network Load Balancing) from Microsoft has the advantage of coming directly with the OS. As its name states, it spreads the load among multiple nodes that are members of the farm (cluster). It's quick and easy to set up, or so it seems, but there are many things to check if you want it to do more than appear to work...

Network impact

NLB can work in two modes:

  • Unicast
  • Multicast (with or without IGMP)

Which one to pick? It depends! Factors that drive the choice:

  • Which application will be used through the farm? Does it support both modes? For example, ISA 2006 only supported unicast until Service Pack 1 (a hotfix was available, but not widely known).
  • How many network cards do the nodes have? Unicast requires at least two interfaces to follow best practice.
  • Do the nodes need to communicate with each other?
  • Is multicast filtering enabled on the switches? It prevents flooding the network.
  • Some switches (Cisco, for example) will not tolerate seeing the same MAC address coming from every node. You then effectively turn your switch into a hub, sending all packets to all farm members.

Monitoring & availability

It is true that if one node drops off the network, the others take over its load. But that only covers full failure. If you have two nodes and just stop your business application on one of them, NLB will still send clients to it, and you have just lost half of your customers. NLB works at layer 3 (IP) and is completely unaware of anything above that layer, even a TCP port that is no longer listening. That's the pitfall of NLB. Microsoft included a sentinel tool in the resource kit that could test a web page on each node and push the node out of the farm if it wasn't working. ISA 2006 manages NLB directly, and can push a node out if ISA goes mad. Otherwise, it's your duty to fill the gap. For a website running on IIS, you can change a metabase key, LoadBalancerCapabilities, to replace the 503 response with a TCP reset, so the client reconnects and resends its request to another node.

To fill this gap, you can use your monitoring solution or a script looping over each node. The goal is to test each node from the application's point of view, and push it out of the farm in case of error. Load balancer appliances (Alteon...) do the same, the industrial way. What you must take care of:

  • You must check nodes as often as possible, but without overloading them. The best approach is to build this monitoring need into the application, with a special web page that tests the application's components (database access...) and reports the result through a status code.
  • Your monitoring becomes "active" (acting directly on production on its own).

Microsoft's monitoring product, SCOM, is interesting here since you can act on triggers (event log, files...).
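The node-by-node check described above can be sketched as a small script (the node names and health URL are placeholders; the real test page is whatever your application exposes):

```shell
# Probe one node's health page; prints OK or FAILED.
check_node() {
  # -f: treat HTTP errors (503...) as failures; --max-time bounds the probe.
  if curl -fsS --max-time 5 "$1" > /dev/null 2>&1; then
    echo "OK"
  else
    echo "FAILED"
  fi
}

# Hypothetical farm members; a FAILED node is a candidate for removal
# from the NLB farm by your monitoring or by an operator.
for node in node1.mydomain.com node2.mydomain.com; do
  echo "$node: $(check_node "http://$node/health.aspx")"
done
```

The same loop can run from a scheduled task or be wired into a SCOM recovery action.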

NLB versus MSCS?

An MSCS cluster is meant to be active/passive. At any time, resources are owned by only one node, which must be able to handle the full load. The good thing is that it can manage data shared across nodes, and it monitors resources (state of Windows services...). There again, it doesn't cover every case, especially when the application is running but no longer answering requests (database access lost...).

Other solutions?

  • I have already set up SafeKit from Evidian on Windows. Not bad, but application checks are still up to you (how could it be otherwise?)
  • Load balancer appliances (F5, Alteon...). As great as they are expensive...
  • Stick with only one node?

KB/Articles:

IIS Responses to Load-Balanced Application Pool Behaviors

NLB Operations Affect All Network Adapters on the Server

Unicast NLB nodes cannot communicate over an NLB-enabled network adaptor in Windows Server 2003

The "NLB troubleshooting overview for Windows Server 2003" article is available

How to deploy a Secure Socket Tunneling Protocol (SSTP)-based VPN server that uses Network Load Balancing (NLB) in Windows Server 2008

An update enables multicast operations for ISA Server integrated NLB

Windows Server 2008 Hyper-V virtual machines generate a Stop error when NLB is configured or when the NLB cluster does not converge as expected

Terminal Services Client Cannot Connect to NLB Cluster TCP/IP Address

The NLB WMI Provider Generates a Lot of Error Entries in the Wbemcore.log File

How NLB Hosts Converge When Connected to a Layer 2 Switch

Windows Server 2003-based NLB nodes in an NLB cluster cannot communicate with each other over an NLB network adapter

Servers in a Network Load Balancing (NLB) failover cluster cannot be used as print servers in Windows Server 2008

Network Load Balancing (NLB) clients cannot connect to the Windows Server 2008 NLB cluster by using the virtual IP address when NLB is running in multicast mode

The virtual IP address of a Windows Server 2008 NLB cluster is bound to the NetBIOS host name of a particular server or of multiple servers