Evgeny Pokhilko's Weblog

OpenGL hardware acceleration through remote X11 SSH connection

Overview

This article is about running an OpenGL application on a remote server while using the local graphics card for rendering. It can help in a few rare scenarios:

1. You have a server on your network with software that produces graphical output, and you want to use that software remotely from many computers without installing it on each machine.
2. Your application needs a specific hardware configuration or resources that are only available on the server.
3. You want to experiment with remote OpenGL rendering.

For demonstration we are going to play chromium-bsu running on a remote server using the local graphics card.

Limitations

You should be aware of the security risk of opening the X server for hardware rendering. With indirect rendering enabled, the SSH server gains more control over the client.

My environment

I have a server with OpenSUSE Tumbleweed 13.2. It runs in headless mode. I also have a laptop with Lubuntu 15.10 wily that connects to the server through SSH. I will call these machines OpenSUSE and Lubuntu.

Terminology

You need to be familiar with general X11 client / server and SSH concepts. Counterintuitively, in my environment the laptop acts as the X server and OpenSUSE as the X client. At the same time, Lubuntu is the SSH client and OpenSUSE is the SSH server.

Direct rendering – the OpenGL application sends instructions directly to the local hardware, bypassing the X server. This is only possible on a single machine, so we cannot use it in our scenario.

Software rendering – a software rendering engine is used and the graphics card is not involved. It can be slow.

Indirect rendering – The remote application sends instructions to the X server which transfers them to the graphics card. This is what we are going to demonstrate in this article.

Before we start

This article will not explain known topics. We assume that SSH is already configured and all packages and drivers are installed. You have a private key for ssh to connect to the server and the server authorized_keys contains the corresponding public key.

Configure OpenSUSE

We need to enable X11 forwarding in the sshd configuration file:

Open /etc/ssh/sshd_config in your favorite text editor.
Add or uncomment X11Forwarding yes
Restart your sshd daemon
service sshd restart
Install the game
zypper install chromium-bsu

Nothing else needs to be done on OpenSUSE.

Configure Lubuntu

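Enable trusted forwarding. Open or create ~/.ssh/config and add ForwardX11Trusted yes under the Host entry that matches your server (<server> below is a placeholder for its name or pattern):
Host <server>
    ForwardX11Trusted yes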

Trusted forwarding is needed to enable 3D hardware rendering.

Enable indirect rendering. Open /etc/lightdm/lightdm.conf

Add or uncomment:

xserver-command=X +iglx

The command enables iglx, which I believe is indirect glx.

Restart lightdm or reboot the X server
systemctl restart lightdm.service

Run Chromium B.S.U

Connect to the SSH server with X11 forwarding enabled (for example with ssh -X, or with ForwardX11 yes in ~/.ssh/config).
Run chromium-bsu with the environment variables below:
LIBGL_ALWAYS_INDIRECT=1 chromium-bsu

You can also enable debug output for LIBGL with LIBGL_DEBUG=verbose chromium-bsu

The game is already playable. I get 30 FPS.

Make it even faster

To improve the performance further, we need to enable a direct TCP connection to the X server rather than going through SSH. Please note that this is a security risk and is disabled by default. Use it only in a trusted local network.

Open /etc/lightdm/lightdm.conf

Add or uncomment:

xserver-allow-tcp=true
xserver-command=X -listen tcp +iglx

Open /etc/xinit/xserverrc

Comment out the line with -nolisten tcp and add a line without that option, as follows:

# exec /usr/bin/X -nolisten tcp "$@"
exec /usr/bin/X "$@"

Enable X11 connection to all clients (very unsafe!):
xhost +

Run Chromium B.S.U. with direct TCP connection

LIBGL_ALWAYS_INDIRECT=1 DISPLAY=<client-tcp>:0 chromium-bsu

This mode gave me the best possible performance. I could not distinguish between the game running locally and remotely from the server.

Troubleshooting and diagnostics

Set the LIBGL_DEBUG=verbose environment variable to see warning messages. I get the following messages when the application falls back to software rendering:

libGL error: failed to authenticate magic 1
libGL error: failed to load driver: r600
libGL: OpenDriver: trying /usr/lib64/dri/tls/swrast_dri.so
libGL: OpenDriver: trying /usr/lib64/dri/swrast_dri.so

You can also see the following message when X server does not allow iglx (+iglx option):

AIGLX: Screen 0 is not DRI capable

March 4, 2017 Posted by evpo | Linux, Networking, SUSE | Leave a comment

GDB: How do I set current source file for list and break commands

People who move to GDB from GUI debuggers often ask the question in the title. The answer is not obvious from the documentation and is not easy to find by searching either. The trick I use is to name a function that I want to debug, like this:

list GetName

I can type Get and press Tab, and GDB suggests matching names. The command above makes the file that contains GetName the current source file.

If I don’t know the function name, I can specify the file name and line number directly.

list document.cpp:1

This is what most people are looking for. After that, the list command continues listing document.cpp, and break 10 sets a breakpoint at line 10 of document.cpp.

May 17, 2015 Posted by evpo | C++ | Leave a comment

How To Create and Seed a Torrent (Ubuntu server, Transmission)

Did you ever need to send somebody large files? This manual provides steps to create new torrents and seed files. The environment is Ubuntu server 12.04 and the torrent client is Transmission 2.51. Any compatible torrent client and OS can be used by the receiver.

An existing solution was very helpful to me in writing this manual, and I made what I consider a few improvements to it. You won’t need to create and configure a watch directory and copy your torrent files into it. I think my version is simpler and more in line with the default Transmission configuration.

We can begin now.

1. SSH to your server and install transmission-daemon if you don’t have it already.

The service is a headless torrent client that can be controlled through command line, web interface or possibly another GUI front end. We are going to use transmission-remote (see below).

sudo apt-get install transmission-daemon

The command will install and start the Transmission service.

2. Configure Transmission to be able to control it from the command line.

After the service is installed the security settings are too strict. It’s not possible to control it even from the command line. Issuing “transmission-remote -l” results in a permission error.

Stop the service with the following command:

sudo service transmission-daemon stop

The Transmission settings file in my environment is located under /var/lib/transmission-daemon/info/settings.json. If it’s not there, start the service and look at its parameters. “ps -Af | grep trans” outputs “/usr/bin/transmission-daemon --config-dir /var/lib/transmission-daemon/info”. info is the directory where settings.json is read from.

Open the settings file in vi or nano.

sudo vi /var/lib/transmission-daemon/info/settings.json

Find rpc-whitelist and rpc-whitelist-enabled. Modify them as follows:

"rpc-whitelist": "127.0.0.1",
"rpc-whitelist-enabled": true,

These settings are also very strict but they are enough for our task. Now you can control Transmission from the same machine and nowhere else. You may consider changing them later if you want to use the web interface.

While you are still in the text editor, note the location of the Transmission downloads directory. You will need it later. This is what I have:

"download-dir": "/var/lib/transmission-daemon/downloads"

Start the service again:

sudo service transmission-daemon start

To verify that you have done everything correctly invoke this command:

transmission-remote -l

It will print a table listing your active torrents. You don’t have any at this stage so the table should be empty (see “transmission-remote --help” for the description of -l and other commands). You can use this command later to see what the service is transferring.

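3. Create a bash script file with the content shown by the cat command below.
It will automate the process of seeding torrents in the future. I called the script seed_file.sh

cat ~/bin/seed_file.sh

#!/bin/bash
# Usage: seed_file.sh <file>
TRANSDIR=/var/lib/transmission-daemon
# Use only the file name, so the script also works when $1 contains a path
NAME=$(basename "$1")
# Copy the file into the Transmission downloads directory, preserving its attributes
cp -p "$1" "$TRANSDIR/downloads/"
# Create the torrent file in $HOME, then hand it to the running service
transmission-create "$TRANSDIR/downloads/$NAME" -t udp://tracker.openbittorrent.com:80 -o "$HOME/$NAME.torrent"
transmission-remote --add "$HOME/$NAME.torrent"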

Modify TRANSDIR variable if needed. $TRANSDIR/downloads should be the directory you noted at step 2. It’s the directory where Transmission downloads files to.

You can call this script as below:

~/bin/seed_file.sh file_I_want_to_send.zip

The script copies the file to the downloads directory (make sure that the file you are sending is readable by the debian-transmission user). Then the torrent file is created in $HOME. Next the script adds the torrent to our service. When the service receives the torrent, it finds that the zip file already exists in the downloads directory (see the cp command) and skips to the Done state as if the file had been downloaded. Seeding starts at this point. Download the torrent file from the server and distribute it to the people on the receiving end.

A few notes on the tracker. You can see the transmission-create -t parameter in the script. The tracker is needed to coordinate peers before they are connected. Once the connection is established it’s not used. You can use any available tracker of your choice.

Hope you find this manual useful.

January 11, 2015 Posted by evpo | Linux, Torrent | linux, Torrent | 2 Comments

GIT TF: Undo shallow pull and pull squashed changeset

November 2, 2014 Posted by evpo | Source Control | git, git-tf, tfs | Leave a comment

Lynx on Windows 7 and lynx_bookmarks.html file problem

Issue:

I can save a bookmark with “A” command but I cannot open the bookmarks page with “V” command. It says

“document directory is not readable”.

Investigation:

Running Process Monitor revealed that Lynx is trying to save lynx_bookmarks.html file in %USERPROFILE%\My Documents and that directory is a junction to %USERPROFILE%\Documents. Probably Lynx is confused with this and cannot read the file from there.

Solution:

Open lynx.cfg and replace

DEFAULT_BOOKMARK_FILE:lynx_bookmarks.html

with

DEFAULT_BOOKMARK_FILE:../lynx_bookmarks.html

October 11, 2014 Posted by evpo | Uncategorized | Leave a comment

Old MacBook Overheating and Installation of Mac OS 10.4 on New Hard Drives

I have an old white MacBook Pro that I bought in 2007. The specs were the following:

13 Inch
Intel Core 2 Duo 2.0 GHz
1 GB RAM, 80 GB
Mac OS 10.4.7

Then I upgraded the OS to 10.5 Leopard and replaced the hard drive with 320GB Western Digital.

I want to talk about solutions to issues I had with my MBP: bad performance and installation of 10.4 Tiger on a new large (more than 120 GB) hard drive.

Problem 1. Bad performance and overheating.

The laptop became slow.
The fan was noisy at more than 6000 rpm.
The CPU reached 80-90 C.
Occasionally a forced shutdown prompt appeared, after which I had to hold the power button for 5 seconds.

After some research on the internet I found that my MBP didn’t support Mac OS 10.5 Leopard very well. The OS demanded too many resources and my old laptop overheated. Upgrading to 10.6 Snow Leopard improved the performance a bit but it didn’t solve the problem completely. The solution was to go back to Mac OS 10.4 Tiger. Obviously 10.4 was not so resource hungry and the laptop was designed for it. Now we come to the next issue.

Problem 2. Mac OS 10.4.7 Installation disks don’t work with large hard drives.

I had two partitions on my MBP that I could switch with Boot Camp: Mac OS 10.6 and Windows 7. I decided to replace Windows 7 partition with Mac OS 10.4. I wanted to have both Tiger and Snow Leopard on my laptop to have a portable zoo. I needed 10.6 for new Mac OS apps that didn’t work with 10.4 and I needed 10.4 because of problem 1 above.

When installing 10.4 Tiger I had another problem. My hard drive 320 GB Western Digital was not recognized correctly by the installer. In the disk tools I saw that my hard drive was 7 TB and I had IO errors when I tried to do anything with it.

Here is the solution:

I connected a small USB external hard drive 80 GB Western Digital and installed 10.4.7 Tiger on that hard drive.
Booted from the USB drive and upgraded Mac OS to 10.4.11, which supports large hard drives.
Booted from the 10.6 Snow Leopard installation disk and used its Backup/Restore tools to backup the 10.4.11 from the external USB drive and restore the image to my former Windows 7 partition.

Happy ending

After working in Tiger for some time I noticed that my CPU doesn’t go above 70 C often, the performance has improved and the fan noise has almost disappeared. I had issues with Safari opening some web sites. I updated Safari and Java and now the web sites work. I also had to install an old version of Skype 2.8.0.866 that supported 10.4 from here (http://www.oldapps.com/mac/skype.php?old_skype=37). Skype audio works but if you want to use video, you need to have an old Skype on the other side too.

March 18, 2014 Posted by evpo | Uncategorized | Apple, Mac OS 10.4, MacBook Pro | 1 Comment

Memory Alignment Of Structures and Classes in C++

Everything below was tested in Visual Studio 2012 Win32. The code for this post is here

Introduction

The memory to store a particular structure or object is split into blocks of determined size. This size is called alignment.

The C++ standard requires that members are stored in memory in the order of declaration. Multiple consecutive members can be “packed” in one block. If the next member doesn’t fit into the remaining bytes of the block, it’s stored in the following block. Those unused bytes in the previous block are called “padding”.

Normally the alignment is determined by the compiler for each structure type. In a simple scenario where the structure doesn’t contain other structures, the alignment is the size of the largest type stored in the structure. However, in some environments a simple type can have an alignment requirement that is smaller than the size of the type. For example double can have an alignment of 4. So it would be better to say that the alignment of a structure is the maximum alignment of its contained types. The standard only allows an alignment to be a power of two.

Examples:

struct Alignment4
{
	char a;
	char b; 
	//char[2] - padding
	int i; // max alignment 4
	Alignment4():a(1), b(2), i(3){}
}; // size is 8

Alignment requirement for int is 4 and char 1. The maximum is 4 so 4 will be the alignment of Alignment4 structure. sizeof(Alignment4)==8

Memory: 01 02 cd cd 03 00 00 00

cd is the special value used for padding in this case (compiled in debug configuration). (Note that 03 is the first byte of the integer indicating that we see little endian – http://en.wikipedia.org/wiki/Endianness).
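The layout can also be checked from code instead of a memory dump. The following is a minimal sketch, assuming a typical platform where int is 4 bytes with an alignment of 4. Alignment4Layout, offset_of_i, padding_after_b and print_alignment4_layout are names invented for this illustration; the constructor is left out so that offsetof is applied to a plain standard-layout type.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdio>    // std::printf

// Same members as Alignment4 above, without the constructor.
struct Alignment4Layout
{
    char a;
    char b;
    // 2 bytes of padding are expected here
    int i;
};

// Where i starts, and how many unused bytes sit between b and i.
constexpr std::size_t offset_of_i = offsetof(Alignment4Layout, i);
constexpr std::size_t padding_after_b =
    offset_of_i - (offsetof(Alignment4Layout, b) + sizeof(char));

// Prints the numbers discussed in the text for the current compiler.
inline void print_alignment4_layout()
{
    std::printf("size=%zu alignment=%zu offset of i=%zu padding after b=%zu\n",
                sizeof(Alignment4Layout), alignof(Alignment4Layout),
                offset_of_i, padding_after_b);
}
```

Calling print_alignment4_layout() from main should report a size of 8 and 2 bytes of padding on such a platform; other compilers or packing settings may give different numbers.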

struct Alignment8
{
	int i; 
	//char[4]
	double d; // max alignment 8
	short s;
	//char[6]
	Alignment8():i(1), d(2), s(3){}
}; // size is 24

Alignments of the types are the following: int 4, double 8, short 2. The maximum is 8 so the alignment of the structure is 8.

Memory: 01 00 00 00 cd cd cd cd 00 00 00 00 00 00 00 40 03 00 cd cd cd cd cd cd

Note: 00 00 00 00 00 00 00 40 is how the double value of 2 is represented in memory.
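To see those bytes without a debugger, the object representation of a double can be copied into a byte array and printed. This is only a sketch: bytes_as_hex is a hypothetical helper, and the result matches the dump above only on a little endian machine with 8-byte IEEE 754 doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Returns the bytes of a double as space-separated hex, lowest address first.
inline std::string bytes_as_hex(double value)
{
    unsigned char raw[sizeof(double)];
    std::memcpy(raw, &value, sizeof(double)); // well-defined way to read the bytes

    std::string out;
    for (std::size_t n = 0; n < sizeof(double); ++n)
    {
        char buf[4];
        const int written = std::snprintf(buf, sizeof(buf), "%02x", raw[n]);
        assert(written == 2); // every byte formats as exactly two hex digits
        (void)written;        // silences the unused warning when NDEBUG is set
        if (n != 0)
        {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

On such a machine bytes_as_hex(2.0) gives 00 00 00 00 00 00 00 40: the exponent bits live in the last byte because the least significant byte is stored first.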

The compiler outputs information about padding as below when /Wall (WarningLevel – EnableAllWarnings in C/C++ => General) switch is used:

1>size_test.cpp(16): warning C4820: ‘Alignment4’ : ‘2’ bytes padding added after data member ‘Alignment4::b’

1>size_test.cpp(24): warning C4820: ‘Alignment8’ : ‘4’ bytes padding added after data member ‘Alignment8::i’

1>size_test.cpp(28): warning C4820: ‘Alignment8’ : ‘6’ bytes padding added after data member ‘Alignment8::s’

1>size_test.cpp(35): warning C4820: ‘BetterAlignment8’ : ‘2’ bytes padding added after data member ‘BetterAlignment8::s’

1>size_test.cpp(43): warning C4820: ‘Size8’ : ‘6’ bytes padding added after data member ‘Size8::b’

1>size_test.cpp(46): warning C4820: ‘Size8’ : ‘7’ bytes padding added after data member ‘Size8::c’

1>size_test.cpp(68): warning C4820: ‘Size6’ : ‘3’ bytes padding added after data member ‘Size6::a’

Optimization

We only need 14 bytes to store Alignment8 but it takes 24. The order of the members is the problem. The block is 8 bytes and when we pack d, it has to start from the next block as there are only 4 bytes left in the current block, which is not enough for d. i and s can fit into one block but they are separated by d. BetterAlignment8 (see below) takes only 16 bytes.

struct BetterAlignment8
{
	int i;
	short s;
	//char[2]
	double d; // max alignment 8
	BetterAlignment8():i(1), s(3), d(2){}
}; // size is 16

Memory: 01 00 00 00 03 00 cd cd 00 00 00 00 00 00 00 40

This will reduce the memory consumption and possibly improve the performance as more elements will fit into the CPU cache.
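The saving can be expressed as numbers that the compiler computes itself. A minimal sketch, assuming 4-byte int, 2-byte short and 8-byte double with an alignment of 8 (true for Win32 and common 64-bit platforms, not guaranteed everywhere). The *Layout names and the constants are invented for this illustration.

```cpp
#include <cstddef>   // std::size_t

// Member order of Alignment8: d splits i and s into separate blocks.
struct Alignment8Layout
{
    int i;
    double d;
    short s;
};

// Member order of BetterAlignment8: i and s share one 8-byte block.
struct BetterAlignment8Layout
{
    int i;
    short s;
    double d;
};

// Bytes that carry data, and bytes lost to padding in each ordering.
constexpr std::size_t payload = sizeof(int) + sizeof(double) + sizeof(short);
constexpr std::size_t padding_original = sizeof(Alignment8Layout) - payload;
constexpr std::size_t padding_reordered = sizeof(BetterAlignment8Layout) - payload;
```

Under those assumptions the payload is 14 bytes, the original order wastes 10 bytes and the reordered one only 2. A common rule of thumb that follows from this is to declare members from the largest alignment to the smallest.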

Alignment control

There are several ways you can tell the compiler to change the default alignment:

Across the project /ZpN where N can be 1, 2, 4, 8, 16.

Specify alignment within code range of a module.

In this example the alignment of Size7 and Size5 will be 1.

#pragma pack(push, 1)
struct Size7
{
	char a;
	char b;
	double i; // 8 bytes, so sizeof is 11 with pack(1) despite the struct name
	char c;
	Size7():a(1), b(2), i(0), c(3){}
};

struct Size5
{
	char a;
	int i;
};
#pragma pack(pop)

Specify alignment per structure type.

struct __declspec(align(8)) Size6
{
	char a;
	int i;
	Size6():a(1), i(2){}
};

You can even control alignment per member http://msdn.microsoft.com/en-us/library/83ythb65.aspx.
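The effect of these directives can be compared side by side. This is a sketch rather than the code from this post: it uses the standard C++11 alignas specifier in place of the MSVC-specific __declspec(align(8)), it relies on #pragma pack also being accepted by GCC and Clang, and it assumes a 4-byte int. The struct names are made up for the illustration.

```cpp
#include <cstddef>   // std::size_t

// Packed to 1: no padding between a and i, alignment drops to 1.
#pragma pack(push, 1)
struct PackedSize5
{
    char a;
    int i;
};
#pragma pack(pop)

// Default rules: 3 bytes of padding after a, alignment 4.
struct DefaultSize8
{
    char a;
    int i;
};

// Alignment raised above the default for this one type.
struct alignas(8) OverAligned8
{
    char a;
    int i;
};

constexpr std::size_t packed_size = sizeof(PackedSize5);
constexpr std::size_t default_size = sizeof(DefaultSize8);
constexpr std::size_t over_aligned_alignment = alignof(OverAligned8);
```

With these assumptions the packed struct takes 5 bytes, the default one 8, and OverAligned8 is aligned to 8. Packing saves space but members may then sit at unaligned addresses, which can be slower or unsupported on some CPUs.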

Compiler oddities and limitations

In Visual Studio 2012, if you specify an alignment that the compiler doesn’t accept in a particular case, it is ignored without a warning. All the cases below were observed in Visual Studio 2012.

An alignment that is not a power of two is ignored:

#pragma pack(push, 3)

#pragma pack(pop)

The pragma pack instruction is ignored if its alignment is greater than the default alignment for the structure:

#pragma pack(push, 8)
// default alignment is 4. If pragma_alignment > default_alignment, the pragma is ignored.
struct Alignment4
{
	char a;
	int i;
	Alignment4():a(1), i(2){}
}; // alignment is still 4
#pragma pack(pop)

The __declspec instruction behaves in the opposite way to #pragma pack: it is ignored if its alignment is less than the default alignment.

// if declspec_alignment <= default_alignment then the instruction is ignored
struct __declspec(align(2)) Size8a
{
	char a;
	int i;
	Size8a():a(1), i(2){}
}; // alignment is still 4

Asserting alignment

If you have specific requirements for the alignment in a piece of code and you don’t know how it will be compiled in the future, it’s always better to assert it in the code. C++11 or TR1 is required.

#include <type_traits>
struct MyStructure
{
	char c;
	int i;
};
static_assert(std::alignment_of<MyStructure>::value == 4, "Alignment of MyStructure must be 4");

If the alignment is not 4, there will be a compilation error.

Alignment conflicts

This is the reason why programmers start to learn about alignment most often. If you declare the same structure in two different modules with different alignment parameters, you will get a conflict. See the example below:

module01.cpp

struct Size5
{
	char a;
	int i;
};
Size5 global_size_5;

module02.cpp

#pragma pack(push, 1)
struct Size5
{
	char a;
	int i;
};
#pragma pack(pop)
extern Size5 global_size_5;

You will get this warning:

LINK : warning C4742: ‘struct Size5 global_size_5’ has different alignment in ‘D:\dhome\size_test\size_test.cpp’ and ‘D:\dhome\size_test\notepad.cpp’: 1 and 4

More about C4742 – http://msdn.microsoft.com/en-us/library/k334t9xx(v=vs.110).aspx

However, it doesn’t always happen. If you return a variable from another module by reference or pointer, there will be no warnings.
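Why the conflict matters becomes visible when the two declarations are compared inside one translation unit. The sketch below gives them different names (Size5Default and Size5Packed, both invented here) so that they can coexist, and it assumes a 4-byte int and a compiler that accepts #pragma pack.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Layout as module01.cpp sees it: default alignment.
struct Size5Default
{
    char a;
    int i;
};

// Layout as module02.cpp sees it: packed to 1.
#pragma pack(push, 1)
struct Size5Packed
{
    char a;
    int i;
};
#pragma pack(pop)

// Each module computes the address of i from its own idea of the layout.
constexpr std::size_t offset_in_module01 = offsetof(Size5Default, i);
constexpr std::size_t offset_in_module02 = offsetof(Size5Packed, i);
```

Under these assumptions module01.cpp places i at offset 4 while module02.cpp expects it at offset 1, so code in the second module would read and write the wrong bytes of the shared global.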

Conclusion

If the system you are working on demands high performance or low memory consumption, you may want to change the default alignment. It should be measured and tested of course. Sometimes you have to deal with alignment even when you don’t expect to. For example the Microsoft code generator for COM proxies inserts alignment control directives. It can cause issues that you need to be prepared to understand. So alignment is worth keeping an eye on.

January 25, 2014 Posted by evpo | C++ | Leave a comment

Align label and input vertically

On an html form we usually have the following layout:

[Label][Input_TextBox]

[Input_CheckBox][Label]

By default a label can be a bit higher than the text box. The solution is to apply this CSS to both the elements:

.inputOrLabel {
	display: inline-block;
	vertical-align: middle;
}

Vertical alignment of blocks is difficult in other cases. See this article for more details.

June 12, 2013 Posted by evpo | ASP.NET, CSS | Leave a comment

Google Test Framework and Visual Studio 2010

Here are a few issues people run into when they start using Google Test Framework with Visual Studio:

1. When an empty project is created with the default configuration in Visual Studio and you link it with gtest.lib, you get a lot of link errors like the following:

1>msvcprtd.lib(MSVCP100D.dll) : error LNK2005: “public: virtual __thiscall std::basic_iostream<char,struct std::char_traits<char> >::~basic_iostream<char,struct std::char_traits<char> >(void)” (??1?$basic_iostream@DU?$char_traits@D@std@@@std@@UAE@XZ) already defined in gtest.lib(gtest-all.obj)
1>msvcprtd.lib(MSVCP100D.dll) : error LNK2005: “public: virtual __thiscall std::basic_ios<char,struct std::char_traits<char> >::~basic_ios<char,struct std::char_traits<char> >(void)” (??1?$basic_ios@DU?$char_traits@D@std@@@std@@UAE@XZ) already defined in gtest.lib(gtest-all.obj)
1>msvcprtd.lib(MSVCP100D.dll) : error LNK2005: “public: __thiscall std::basic_iostream<char,struct std::char_traits<char> >::basic_iostream<char,struct std::char_traits<char> >(class std::basic_streambuf<char,struct std::char_traits<char> > *)” (??0?$basic_iostream@DU?$char_traits@D@std@@@std@@QAE@PAV?$basic_streambuf@DU?$char_traits@D@std@@@1@@Z) already defined in gtest.lib(gtest-all.obj)

This happens because the default compiler switch of gtest is /MTd and your empty project has /MDd. You need to make them consistent. One option is to change your test project and the tested project to /MTd.

2. You get the following errors when you mix your Release build with a gtest Debug build or vice versa:

gtest.lib(gtest-all.obj) : error LNK2038: mismatch detected for ‘_ITERATOR_DEBUG_LEVEL’: value ‘0’ doesn’t match value ‘2’ in convex_hull_tests.obj
2>LIBCMT.lib(invarg.obj) : error LNK2005: __initp_misc_invarg already defined in LIBCMTD.lib(invarg.obj)
2>LIBCMT.lib(invarg.obj) : error LNK2005: __call_reportfault already defined in LIBCMTD.lib(invarg.obj)
2>LIBCMT.lib(invarg.obj) : error LNK2005: __set_invalid_parameter_handler already defined in LIBCMTD.lib(invarg.obj)
2>LIBCMT.lib(invarg.obj) : error LNK2005: __get_invalid_parameter_handler already defined in LIBCMTD.lib(invarg.obj)
2>LIBCMT.lib(invarg.obj) : error LNK2005: __invoke_watson already defined in LIBCMTD.lib(invarg.obj)

May 20, 2013 Posted by evpo | C++ | C++ UnitTest | Leave a comment

Convex Hull

I read about the problem of finding the convex hull in The Algorithm Design Manual and knew that the complexity should be O(N) for points sorted by one coordinate. Before reading about existing solutions I was determined to find my own, to “reinvent the wheel” in other words. I spent hours thinking about a solution. In the end I found one:

I have a collection containing convex hull points. It is a std::list. It’s empty at the beginning. I loop through the points sorted by x, so I move from left to right across the picture. At the end of processing each point I maintain the algorithm invariant: the list contains the convex hull of the points processed so far. If you imagine this visually, it’s like putting a deflated balloon on an object from left to right.

When I process a point, I calculate whether it’s on the top edge of the hull and then I do the same for the bottom edge. I also loop through the points that are already in my convex hull collection to see whether they are still on the edge given the current point. If a point is not, I remove it from the hull collection.

An important part of the algorithm is the function ConvexHull::compare_vectors. It takes three points and calculates whether the point in the middle is on the edge of the convex hull. It represents the three points p1, p2 and p3 as two vectors p1 – p2 and p2 – p3. Then it extends the p1 – p2 vector until it intersects the line x = p3.x. If y of the intersection point is higher than p3.y, the middle point p2 is on the top edge of the convex hull (picture 1).
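The test described above can be sketched in a few lines. This is not the code of ConvexHull::compare_vectors from the repository; Point, middle_is_on_top_edge and top_edge are names invented for this illustration. The sketch assumes a y axis that points up, points sorted by strictly increasing x, and it uses std::vector instead of a list for brevity. It only builds the top edge; the bottom edge would use the mirrored comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point
{
    double x;
    double y;
};

// True when the extension of p1 -> p2 reaches the line x = p3.x at a y that
// is higher than p3.y, which means p2 stays on the top edge.
inline bool middle_is_on_top_edge(const Point& p1, const Point& p2, const Point& p3)
{
    assert(p1.x < p2.x); // precondition: the points come sorted by x
    // Cross-multiplied form of
    //   p1.y + (p2.y - p1.y) * (p3.x - p1.x) / (p2.x - p1.x) > p3.y
    // It avoids the division and is valid because p2.x - p1.x > 0.
    return (p2.y - p1.y) * (p3.x - p1.x) > (p3.y - p1.y) * (p2.x - p1.x);
}

// Builds the top edge of the hull from points sorted by strictly increasing x.
inline std::vector<Point> top_edge(const std::vector<Point>& sorted)
{
    std::vector<Point> hull;
    for (const Point& p : sorted)
    {
        // Remove earlier points that the new point pushes off the top edge.
        while (hull.size() >= 2 &&
               !middle_is_on_top_edge(hull[hull.size() - 2], hull.back(), p))
        {
            hull.pop_back();
        }
        hull.push_back(p);
    }
    return hull;
}
```

Each point is pushed once and popped at most once, which is where the linear complexity for sorted input comes from. For the points (0,0), (1,2), (2,1), (3,3), (4,0) the point (2,1) is dropped as soon as (3,3) is processed.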

Picture 1

The algorithm is in a console application that takes an image file (Picture 2), draws the convex hull and saves it to another image file (Picture 3). It supports multiple image file formats thanks to the stb_image library.

Picture 2

After experimenting with my rasterizing algorithm I used Anti-Grain Geometry by Maxim Shemanarev. See the result on Picture 4.

Picture 3

Picture 4

You can find the sources on GitHub: https://github.com/evpo/ConvexHull

Binaries for windows are also available:

http://sourceforge.net/projects/convexhull/files/ (100% CLEAN award granted by Softpedia)

Download the zip file and run the cmd file to see it in action.

May 4, 2013 Posted by evpo | Algorithms, C++ | 1 Comment
