Sandro Posted November 27, 2015

Hi uxcn :)

So you've chosen clang/llvm as your "default compiler"? Excuse my question, but why that choice? (For example, I've tried PC-BSD 11.X and it is much slower than 10.X, which uses GCC as its compiler.) Mumble... mumble... Thank you for any suggestions :)

I think that if you declare "-march", then "-mtune" is redundant (not useful?).

I see a -j16... WOW!!! :P :P :P Is your system based on a dual Xeon (with 4 cores / 8 threads each)?

Thanks for the suggestions :)
uxcn Posted November 27, 2015

> So you've chosen clang/llvm as your "default compiler"? Excuse my question, but why that choice?

I switched to Clang as the default on my laptop when I had to rebuild it a while ago. It was mostly just an experiment, but I have noticed that Clang compiles a lot faster than GCC. It also catches a number of optimizations GCC misses. I tried switching to LTO recently, and the binaries it emits are even cleaner. I haven't compared against GCC 5.2 though. I think FreeBSD switched to Clang as its default recently too.

> I think that if you declare "-march", then "-mtune" is redundant (not useful?).

I think you're right, -mtune is completely redundant. It's harmless though, so I've just left it in the configs.

> I see a -j16... WOW!!! :P :P :P Is your system based on a dual Xeon (with 4 cores / 8 threads each)?

It's a four-core SMT CPU (i7-3632QM), but I use ccache and distcc too. The "cluster" I compile against adds another eight processors. I use a tmpfs for portage as well, which generally keeps things CPU bound.

> Thanks for the suggestions :)

No problem :).
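The point about -march and -mtune can be shown as a make.conf fragment. This is a minimal sketch, not uxcn's actual file: the flag values are assumptions chosen for illustration. On x86, GCC documents that -march=<cpu> implies -mtune=<cpu>, so a separate -mtune naming the same CPU adds nothing.

```shell
# Sketch only; the flag values are assumptions, not taken from the thread.
# -march=<cpu> already selects tuning for that CPU, so -mtune is left out.
CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"

# With GCC, the target options in effect can be inspected with:
#   gcc -march=native -Q --help=target
```

An explicit -mtune only matters when it names a different CPU than -march, for example when building for an older baseline while tuning for a newer chip.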
Sandro Posted November 27, 2015

I know that compiling an ebuild may be faster with llvm, but I'm interested in the binary code a compiler generates rather than in compile time. Phoronix, for example (sorry, I don't have a link at the moment), has comparisons of gcc vs. llvm/clang, also compared with ICC (Intel C Compiler) and Open64 (AMD's open source compiler, which has also been in the portage tree for a short while). The results told me that gcc is the better-performing compiler in terms of generated binary code (not in all cases, but on average). But since clang is evolving (as is gcc), the answer is best found by looking at the benchmarks :) Take a look at the Phoronix website if you want :)

Bye, and I hope I haven't bored you :) With esteem, Sandro :)

PS: Sorry if I'm a little "off topic" :|. It might be nice to open a new thread about the performance of various compilers... :)

Ciao :D (from Italy) :)
uxcn Posted November 28, 2015

I haven't had any performance problems with Clang, particularly using LTO. On the theory side of things, I think Clang and GCC perform better for different things. I have noticed that Clang tends to do better constant folding and strength reduction in the generated code that I've looked at. That may have changed with more recent GCC releases though.

If you're interested in performance, LTO with the same compiler will probably give a better performance increase than switching to another compiler. I know the binary Clang emits for vlc using LTO is more than 100KiB smaller than without LTO. It might be worth benchmarking.

It's a few versions behind for Clang and GCC, but one of the better compiler comparisons I read was for compiling Firefox. GCC with LTO did generate a smaller and faster binary than Clang with LTO, but again it's a few versions behind for both compilers.

> Ciao :D (from Italy) :)

Grazie... although my Italian is honestly worse than my Spanish :).
Sandro Posted November 28, 2015

Oh, this discussion is getting really interesting (for me :P). I'm happy that you're such a kind man :) I'm a "little penguin", but I want to learn as much as possible :)

Looking at these benchmarks (http://www.phoronix.com/scan.php?page=article&item=clang-37-gcc52&num=1), it seems that llvm creates better binary code for some applications... I'd like to try your method :) But there is also:

* dev-lang/icc
Available versions: ~13.0.0.079^m ~13.0.1.117^m ~13.1.2.146^m ~13.1.3.163^m ~13.1.5.192^m ~14.0.0.080^m ~14.0.1.106^m ~14.0.2.144^m ~14.0.3.174^m ~15.0.0.090^m ~15.0.1.133^m ~15.0.2.164^m ~15.0.3.187^m {eclipse examples multilib LINGUAS="ja"}
Homepage: http://software.intel.com/en-us/articles/intel-composer-xe/
Description: Intel C/C++ Compiler

Hehe, I've "stolen" (copied into a text file) your make.conf (I want to study it), because one of these days I want to get some experience with llvm; then I'll tell you about some benchmarks that I can run on my system :)

Only one question, about LDFLAGS. I have LDFLAGS="${LDFLAGS} -Wl,--hash-style=gnu", which gives:

ci74771ht ~ # emerge --info|grep LDFLAGS
LDFLAGS="-Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,--hash-style=gnu"

I can see that you've also declared -march there... is that necessary when using llvm?

Excuse me for my questions...
uxcn Posted November 28, 2015

I hoped the files might be helpful. You're definitely welcome to copy any of them, and I'll try to answer any questions.

> but there is also * dev-lang/icc

ICC (the Intel C/C++ Compiler) can also optimize extremely well, but it's proprietary (you need a license).

If you're interested in seeing some of the differences between generated binaries, you might want to try benchmarking something like transcoding video with ffmpeg. Since transcoding is CPU intensive, it's a good way to get an idea of the performance of the code a compiler generates.

> I can see that you've also declared -march there... is that necessary when using llvm?

Not just Clang/LLVM, it's for LTO (link-time optimization). Since optimization is done when the compiler links instead of when it compiles, the linker needs the optimization arguments. GCC with LTO is similar.

> Excuse me for my questions...

It's not a problem :).
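The LTO point above (the link step needs the optimization arguments too) could look like this in make.conf. This is a minimal sketch under assumptions: the flag values are illustrative and are not copied from uxcn's configuration, which is not shown in this thread.

```shell
# Sketch of an LTO setup in make.conf; the flag values are assumptions.
CFLAGS="-march=native -O2 -pipe -flto"
CXXFLAGS="${CFLAGS}"
# Repeat the optimization flags for the link step, where LTO does its work.
LDFLAGS="${LDFLAGS} ${CFLAGS}"
```

Appending to ${LDFLAGS} rather than overwriting it keeps whatever the profile already sets, such as the -Wl,-O1 and -Wl,--as-needed entries visible in the emerge --info output quoted earlier.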
Sandro Posted November 28, 2015

Thank you, uxcn... You're very, very kind :) All the best :) God bless you :)
coffnix Posted November 30, 2015

FEATURES="distlocks metadata-transfer sandbox sfperms strict parallel-fetch buildpkg -ccache"
USE="-gles vim-syntax fontconfig qt3support wps opengl gpm jpeg ogg truetype device-mapper lzma symlink openssl -systemd syslog kmod -bindist -bluetooth -gtk -qt4 qt5 X -ldap caps mmx sse sse2 python perl idn openipmi snmp apache tcl threads sasl bash-completion urandom threads python_abis_2.6 python_abis_2.7 python_abis_3.3 command-args -gcrypt -php5-3 -php5-5 -gnutls -static-libs cluster -curlwrappers curl geoip -jit lua clamav postgres -ptpax xattr tproxy vorbis dbus policykit xkb alsa -pulseaudio ios ntp -kde gallium -glamor udev xmp apng -egl evdev openmp -mysql chm djvu dpi ebook mobi fam nsplugins networkmanager samba cairo consolekit fuse wifi -espeak -udisks -teamd -handbook -upower pax_kernel exiv2 -mono dos rar kpathsea opencl clang xinerama -gles2"
CPU_FLAGS_X86="mmx sse sse2 sse3 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt mmxext"
MAKEOPTS="-j7"
PHP_TARGETS="php5-4"
PYTHON_SINGLE_TARGET="python2_7"
PYTHON_TARGETS="python2_7 python3_3"
PYTHON_ABIS="2.7 3.3"
PORTDIR="/usr/portage"
PKGDIR="/storage/portage/packages"
PORT_LOGDIR="/var/log/portage"
ACCEPT_LICENSE="*"
LINGUAS="en pt_BR"
VIDEO_CARDS="vesa radeon intel i915 i965"
INPUT_DEVICES="evdev keyboard mouse synaptics"
PAX_MARKINGS="XT"
QEMU_SOFTMMU_TARGETS="i386 x86_64 arm mips mips64 mips64el mipsel"
QEMU_USER_TARGETS="i386 x86_64 arm mips mips64 mips64el mipsel"
ABI_X86="32 64"
PORTDIR_OVERLAY="/var/overlay/local /var/overlay/python"
foobashrc_modules = "localpatch"
LOCALPATCH_OVERLAY="/etc/portage/localpatches"
source /var/lib/layman/make.conf
PORT_LOGDIR="/storage/portage/var-log-portage/"
Sandro Posted February 9, 2016

@uxcn: Excuse me, I have a little question. I've added -march=native and -flto to LDFLAGS. But since -flto is an optimization for the linker, why did you also declare that flag in C*FLAGS? Isn't it redundant? If not, what is the technical reason?

And does -flto include IPO, or must "-fwhole-program --combine" be declared as well?

Excuse me for the question, and thanks very much :)

Sandro
uxcn Posted February 15, 2016

> @uxcn: Excuse me, I have a little question. I've added -march=native and -flto to LDFLAGS. But since -flto is an optimization for the linker, why did you also declare that flag in C*FLAGS? Isn't it redundant? If not, what is the technical reason?

No worries :). There are actually a couple of reasons why you want to add -flto at compile time (e.g. in CFLAGS). With LTO, the compiler typically emits an intermediate format instead of its normal object code, to preserve information for optimization later on. Generally the compiler also wants to delay any optimizing until the link stage, since an optimization might not make sense in the library or whole-program context.

> And does -flto include IPO, or must "-fwhole-program --combine" be declared as well?

When you use -flto, the optimizer essentially assumes one large compilation unit at link time and can do IPO over that. Without LTO, optimization is normally done per compilation unit, so -fwhole-program is sometimes a necessary hint that the optimizer can consider the compilation unit the whole program. With LTO it's unnecessary though, and it would most likely pessimize the result.

Hope this helps...
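The two-stage use of -flto described above can be tried outside Portage with a tiny two-file program. This is a sketch under assumptions: it assumes the cc on your PATH is a GCC or Clang driver with working LTO support, and the file names, the LTO_FLAGS variable, and the lto_status variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: pass -flto both when compiling and when linking.
CC="${CC:-cc}"          # assumption: a GCC or Clang driver with working LTO
LTO_FLAGS="-O2 -flto"
lto_status="skipped"    # stays "skipped" if no compiler is found

if command -v "$CC" >/dev/null 2>&1; then
    dir=$(mktemp -d)
    printf 'int add(int a, int b) { return a + b; }\n' > "$dir/add.c"
    printf 'int add(int, int);\nint main(void) { return add(2, 3) == 5 ? 0 : 1; }\n' > "$dir/main.c"

    # Compile steps: with -flto the objects carry the compiler's intermediate
    # representation, so optimization can be deferred.
    # Link step: the same flags again; optimization across both units
    # happens here.
    if "$CC" $LTO_FLAGS -c "$dir/add.c" -o "$dir/add.o" &&
       "$CC" $LTO_FLAGS -c "$dir/main.c" -o "$dir/main.o" &&
       "$CC" $LTO_FLAGS "$dir/add.o" "$dir/main.o" -o "$dir/demo" &&
       "$dir/demo"; then
        lto_status="ok"
    else
        lto_status="failed"
    fi
    rm -rf "$dir"
fi
echo "LTO build: $lto_status"
```

Because the objects are built with -flto at compile time, the link step has the information it needs to optimize across add.c and main.c, which is why the flag belongs in both CFLAGS and LDFLAGS.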
AdiosKid Posted February 15, 2016

##################################
##################################
###                            ###
###     Make Configuration     ###
###                            ###
##################################
##################################

CFLAGS="-march=core-avx2 -O2 -pipe"
CXXFLAGS="${CFLAGS}"
CHOST="x86_64-pc-linux-gnu"
FEATURES="parallel-fetch collision-protect protect-owned buildpkg"
ACCEPT_LICENSE="*"
LINGUAS="pt_BR en"
VIDEO_CARDS="fglrx intel i965"
ALSA_CARDS="hda-intel"
INPUT_DEVICES="evdev keyboard mouse synaptics"
MAKEOPTS="-j5"
ABI_X86="32 64"
CPU_FLAGS_X86="aes avx fma3 mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"
CORE="acl bindist bzip2 consolekit dbus evdev lzma policykit nls nptl symlink X zip zlib"
SYSTEM="cairo cryptsetup device-mapper imagemagick introspection gpm gtk openmp python threads udisks usbredir xcb zsh-completion"
PROCESSOR="postproc egl opengl vaapi vdpau xa xvmc"
FONTS="truetype type1 cleartype corefonts"
IMAGE="apng exif gif gphoto2 jpeg jpeg2k png svg tiff webp wmf"
AUDIO="alsa cdio faac faad flac mp3 lame pulseaudio sdl sndfile wavpack"
VIDEO="ffmpeg matroska theora v4l vpx x264 x265"
WEBCAM="v4l"
NETWORK="http ipv6 networkmanager -samba"
WEB="apache2 berkdb curl fpm ftp gdbm dbm mysql mysqli pdo nsplugin nginx php readline sqlite"
DEVICES="ieee1394 mtp usb touchpad"
SECURE="crypt ssl"
REMOVED=""
USE="${CORE} ${SYSTEM} ${PROCESSOR} ${FONTS} ${IMAGE} ${AUDIO} ${VIDEO} ${WEBCAM} ${NETWORK} ${WEB} ${DEVICES} ${SECURE} ${REMOVED}"
PORTAGE_TMPDIR="/tmp"
PORTDIR="/usr/portage"
DISTDIR="${PORTDIR}/distfiles"
PKGDIR="${PORTDIR}/packages"
GRUB_PLATFORMS="efi-64 pc"
PYTHON_ABIS="2.7 3.3"
PYTHON_SINGLE_TARGET="python3_3"
PYTHON_TARGETS="python2_7 python3_3"
RUBY_TARGETS="ruby20 ruby21 ruby22 ruby23"
PHP_TARGETS="php5-5 php5-6 php7-0"
NGINX_ADD_HTTP="fastcgi"
Sandro Posted February 15, 2016

@uxcn: Great explanation... Thank you very much, great friend :) :) :)
gb00s Posted February 15, 2016

CHOST="x86_64-pc-linux-gnu"
CFLAGS="-march=westmere -O2 -pipe"
CPU_FLAGS_x86="aes mmx mmxext popcnt sse sse2 sse3 ssse3 sse4_1 sse4_2"
CXXFLAGS="${CFLAGS}"
KERNEL="linux"
#MAKEOPTS="-j24"
#EMERGE_DEFAULT_OPTS="--with-bdeps=y --ask --ask-enter-invalid --verbose"
MAKEOPTS="-j24 -l24"
EMERGE_DEFAULT_OPTS="--with-bdeps=y --ask --ask-enter-invalid --verbose --jobs=24 --load-average=24"
FEATURES="parallel-fetch collision-protect"
ADD="acl alsa alsa-plugins bash consolekit dbus eudev threads udev unicode X"
REMOVE="-gnome -introspection -kde -ldap -qt4 -qt5 -systemd -wayland"
HW="nvidia bluetooth"
CPU="aes mmx mmxext popcnt sse sse2 sse3 ssse3 sse4_1 sse4_2"
USE="${ADD} ${REMOVE} ${HW} ${CPU} ${VIRT}"
ACCEPT_KEYWORDS="~amd64"
ABI_X86="32 64"
VIDEO_CARDS="nvidia"
INPUT_DEVICES="evdev"
LINGUAS="en_GB"
ACCEPT_LICENSE="* EULA"
papu Posted October 25, 2016

Hi all, this is mine:

CFLAGS="-march=native -O2 -pipe"
CXXFLAGS="${CFLAGS}"
CHOST="x86_64-pc-linux-gnu"
MAKEOPTS="-j5 -l4"
ABI_X86="64 32"
ACCEPT_LICENSE="*"
CPU_FLAGS_X86="aes avx mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"
CURL_SSL="libressl"
EMERGE_DEFAULT_OPTS="${EMERGE_DEFAULT_OPTS} --autounmask-write=y --complete-graph=y --color=y --load-average=4 --keep-going -v --verbose-conflicts --with-bdeps=y"
FEATURES="${FEATURES} candy cgroup nodoc noinfo parallel-fetch parallel-install"
DISTDIR="/mnt/sources/distfiles/"
INPUT_DEVICES="evdev"
L10N="ca"
LINGUAS="ca"
PKGDIR="/mnt/sources/packages/"
RUBY_TARGETS="ruby22 ruby23"
positive="cacert gstreamer libressl lzma lzo openal opencl openexr pulseaudio vdpau"
negativa="-bluetooth -geolocation -gnome -gstreamer010 -handbook -ios -ipod -kde -qt3support -qt4 -wireless -xinerama"
USE="${positive} ${negativa}"
VIDEO_CARDS="radeon radeonsi"

=== Enabled Profiles: ===
arch: x86-64bit
build: current
subarch: intel64-ivybridge
flavor: desktop
mix-ins: kde-plasma-5

:rolleyes:
Sandro Posted October 26, 2016

There is a little error: MAKEOPS="-j5 -l4" instead of MAKEOPTS="-j5 -l4".

However... I don't use "-l". Excuse my ignorance, but what does "-l4" do? Thx.
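A misspelled name such as MAKEOPS does not appear to raise any error, so it can help to check the file for the variable names you expect. A minimal sketch, assuming make.conf lives at /etc/portage/make.conf (the path may differ on your system); the helper name has_var is hypothetical and not part of Portage:

```shell
#!/bin/sh
# has_var NAME FILE: succeed if FILE assigns NAME at the start of a line.
# Hypothetical helper for illustration; not part of Portage.
has_var() {
    grep -q "^$1=" "$2"
}

conf="${1:-/etc/portage/make.conf}"   # assumed default location
for name in CFLAGS MAKEOPTS USE; do
    if [ -f "$conf" ] && has_var "$name" "$conf"; then
        echo "$name: set in $conf"
    else
        echo "$name: not found in $conf"
    fi
done
```

Another way to see what Portage actually picked up is the approach used earlier in the thread for LDFLAGS, for example emerge --info | grep MAKEOPTS.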
papu Posted October 26, 2016

> There is a little error: MAKEOPS="-j5 -l4" instead of MAKEOPTS="-j5 -l4". However... I don't use "-l". Excuse my ignorance, but what does "-l4" do? Thx.

Oh yes, thanks a lot, I had written it wrong :(

--load-average X.Y = -lX.Y

MAKEOPTS
Use this variable if you want to use parallel make. For example, if you have a dual-processor system, set this variable to "-j2" or "-j3" for enhanced build performance with many packages. Suggested settings are between CPUs+1 and 2*CPUs+1. In order to avoid excess load, the --load-average option is recommended. For more information, see make(1). Also see emerge(1) for information about analogous --jobs and --load-average options.

https://wiki.gentoo.org/wiki//etc/portage/make.conf#MAKEOPTS
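The rule of thumb quoted above can be turned into a small helper that prints a MAKEOPTS line for the current machine. This is a sketch under assumptions: suggest_makeopts is a hypothetical name, it uses the low end of the suggested range (CPUs+1 jobs), and it caps the load average at the CPU count. As far as I know, make.conf expands variables but does not run command substitution, so this is a separate script whose output you would paste into the file.

```shell
#!/bin/sh
# suggest_makeopts CPUS: print a MAKEOPTS value with CPUS+1 jobs and the
# load average capped at CPUS.
# Hypothetical helper for illustration; not part of Portage.
suggest_makeopts() {
    cpus=$1
    echo "-j$((cpus + 1)) -l${cpus}"
}

# getconf _NPROCESSORS_ONLN reports the online logical CPUs on Linux.
cpus=$(getconf _NPROCESSORS_ONLN)
echo "MAKEOPTS=\"$(suggest_makeopts "$cpus")\""
```

For a machine with 4 logical CPUs this gives the same "-j5 -l4" used in the configuration posted above. With distcc in the picture, as in uxcn's -j16 setup, the job count would also account for the remote processors.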
Sandro Posted October 26, 2016

Ok... Thanks, friend papu :) I've never used the "load average" :D
papu Posted October 26, 2016

> Ok... Thanks, friend papu :) I've never used the "load average" :D

Maybe it's not very important: because of the typo I hadn't actually been using it for 2 months, and I didn't notice anything special, hahaha. Anyway, compilations are now optimized for my 4 CPU cores and work like a charm.
Sandro Posted October 27, 2016

papu :) :) :) very happy :P
Havis Posted February 3, 2017

This is the make.conf used on my Skylake system (i7 6700k). You can use "app-portage/cpuid2cpuflags" to adjust CPU_FLAGS_X86 for other systems (run cpuinfo2cpuflags-x86).

make.conf à la Havis:

ABI_X86="64 32"
CFLAGS="-march=broadwell -mclflushopt -mxsavec -mxsaves -O2 -pipe"
CXXFLAGS="${CFLAGS}"
CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"
FEATURES="buildpkg network-sandbox parallel-fetch sandbox userpriv usersandbox userfetch usersync"
EMERGE_DEFAULT_OPTS="--ask --verbose --quiet-build=y --with-bdeps=y --backtrack=10"
DISTDIR=/var/portage/distfiles
PKGDIR=/var/portage/packages
MAKEOPTS="-j8"
PORTAGE_NICENESS="19"
CLEAN_DELAY="3"
LINGUAS="*"
VIDEO_CARDS="amdgpu vesa intel i915 i965 mga nv nouveau r100 r200 r300 r600 radeonsi radeon v4l qxl"
ACCEPT_LICENSE="@FSF-APPROVED"
CPU_USE="threads smp custom-cflags custom-optimization"
PACKERS_USE="lzma lzo lz4 minizip"
AUDIO_USE="alsa openal pulseaudio"
AUDIO_CODECS_USE="opus fdk"
VIDEO_CODECS_USE="libass vcd xv quicktime mp4"
GFX_USE="djvu mng dvi xmp xpm"
X_USE="glamor introspection sdl vdpau xa xvmc spice wayland gles xcb xkb"
LIB_USE=""
USE=" \
${CPU_USE} \
${PACKERS_USE} \
${AUDIO_USE} \
${AUDIO_CODECS_USE} \
${VIDEO_CODECS_USE} \
${GFX_USE} \
${X_USE} \
${LIB_USE} \
"
QEMU_SOFTMMU_TARGETS="*"
QEMU_USER_TARGETS="*"
GRUB_PLATFORMS="efi-32 efi-64 pc"