For a long time now, the applications we develop have been running on multi-core systems. Yet we still write them as if they were going to run on a single-processor machine. As a result we fail to exploit the processing capacity of the multi-core architecture, and because the application uses only one core, its response time stays high. Many platforms now offer solutions that make it easier to develop applications for multi-core architectures. On the Java SE 6 platform it is possible to write multithreaded applications with the Concurrency API, but because the other standard Java SE APIs (for example the Collections, Networking, and Security APIs) were not implemented with multi-core architectures in mind, the runtime performance of the software improves only to a limited extent. This looks set to be overcome by Fork-Join in Java 7 and, in particular, by the Parallel Collections and Map/Reduce framework in JDK 8, which is expected to be released in 2014.

On the other hand, multi-core architectures are not limited to CPUs; graphics processors, for example, are multi-core as well. Moreover, graphics processors contain far more cores than general-purpose processors. On the Java platform, we may be doing GPU programming as of Java 9.
The OpenCL library stands out as a solution that lets us develop applications for both CPUs and graphics processors, independent of the processor's vendor, architecture, and type. OpenCL is developed by the Khronos Group. At the time of writing, the current version is 1.2. The base programming language of OpenCL is C. OpenCL routines can be reached from Java through the Java Native Interface (JNI). This article explains how to access OpenCL routines from Java classes.
We will access the OpenCL library from the Java platform through JNI. To retrieve the list of OpenCL platforms and their device information, we will use the PlatformInfo and DeviceInfo classes:

PlatformInfo.java

    package com.example.opencl.utility;

    import java.io.Serializable;

    /**
     *
     * @author Binnur Kurt (binnur.kurt@gmail.com)
     */
    public class PlatformInfo implements Serializable {

        private String platformVendor;
        private int numberOfDevices;
        // Filled in from native code through setDeviceInfo()
        private DeviceInfo[] deviceInfos;

        public PlatformInfo(String platformVendor, int numberOfDevices) {
            this.platformVendor = platformVendor;
            this.numberOfDevices = numberOfDevices;
            deviceInfos = new DeviceInfo[numberOfDevices];
        }

        public String getPlatformVendor() {
            return platformVendor;
        }

        public int getNumberOfDevices() {
            return numberOfDevices;
        }

        public DeviceInfo getDeviceInfo(int deviceNo) {
            return deviceInfos[deviceNo];
        }

        public void setDeviceInfo(int deviceNo, DeviceInfo deviceInfo) {
            this.deviceInfos[deviceNo] = deviceInfo;
        }

        public DeviceInfo[] getDeviceInfos() {
            return deviceInfos;
        }

        @Override
        public String toString() {
            return "PlatformInfo{" + "platformVendor=" + platformVendor
                    + ", numberOfDevices=" + numberOfDevices + '}';
        }
    }

DeviceInfo.java

    package com.example.opencl.utility;

    import java.io.Serializable;

    /**
     *
     * @author Binnur Kurt (binnur.kurt@gmail.com)
     */
    public class DeviceInfo implements Serializable {

        private int addressSize;
        private String extensions;
        private String deviceName;

        public DeviceInfo(int addressSize, String extensions, String deviceName) {
            this.addressSize = addressSize;
            this.extensions = extensions;
            this.deviceName = deviceName;
        }

        public int getAddressSize() {
            return addressSize;
        }

        public void setAddressSize(int addressSize) {
            this.addressSize = addressSize;
        }

        public String getExtensions() {
            return extensions;
        }

        public void setExtensions(String extensions) {
            this.extensions = extensions;
        }

        public String getDeviceName() {
            return deviceName;
        }

        public void setDeviceName(String deviceName) {
            this.deviceName = deviceName;
        }

        @Override
        public String toString() {
            return "DeviceInfo{" + "deviceName=" + deviceName
                    + ", addressSize=" + addressSize
                    + ", \nextensions=" + extensions + "\n}";
        }
    }

We will establish the connection to OpenCL with the OpenCLUtility class given below:

    package com.example.opencl.utility;

    import com.example.domain.Element;

    /**
     *
     * @author Binnur Kurt (binnur.kurt@gmail.com)
     */
    public class OpenCLUtility {

        public native PlatformInfo[] getPlatformInfo();

        public native void execute(Element[] array);

        // Load the native library that implements the two methods above
        static {
            System.loadLibrary("myopencllib");
        }
    }

We generate the C++ header file required for JNI with the following command (javah was removed in JDK 10; on newer JDKs, javac -h produces the same header):

    javah -jni -o OpenCLUtility.h com.example.opencl.utility.OpenCLUtility

OpenCLUtility.h

    #include <jni.h>
    /* Header for class com_example_opencl_utility_OpenCLUtility */

    #ifndef _Included_com_example_opencl_utility_OpenCLUtility
    #define _Included_com_example_opencl_utility_OpenCLUtility
    #ifdef __cplusplus
    extern "C" {
    #endif
    /*
     * Class:     com_example_opencl_utility_OpenCLUtility
     * Method:    getPlatformInfo
     * Signature: ()[Lcom/example/opencl/utility/PlatformInfo;
     */
    JNIEXPORT jobjectArray JNICALL Java_com_example_opencl_utility_OpenCLUtility_getPlatformInfo
      (JNIEnv *, jobject);

    /*
     * Class:     com_example_opencl_utility_OpenCLUtility
     * Method:    execute
     * Signature: ([Lcom/example/domain/Element;)V
     */
    JNIEXPORT void JNICALL Java_com_example_opencl_utility_OpenCLUtility_execute
      (JNIEnv *, jobject, jobjectArray);

    #ifdef __cplusplus
    }
    #endif
    #endif
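The generated header relies on two JNI naming rules: the type descriptors shown in the `Signature:` comments, and the mangled `Java_...` function names. The sketch below reproduces both rules in plain Java so they can be checked without running the header generator. The class `JniSignatures` and its methods are hypothetical helpers written for this illustration, and the name mangling handles only ASCII names of non-overloaded methods.

```java
public class JniSignatures {

    /** Returns the JNI type descriptor for a Java type, e.g. int -> "I". */
    public static String descriptor(Class<?> type) {
        if (type.isArray()) {
            return "[" + descriptor(type.getComponentType());
        }
        if (type.isPrimitive()) {
            if (type == void.class) return "V";
            if (type == boolean.class) return "Z";
            if (type == byte.class) return "B";
            if (type == char.class) return "C";
            if (type == short.class) return "S";
            if (type == int.class) return "I";
            if (type == long.class) return "J";
            if (type == float.class) return "F";
            return "D"; // double is the only primitive left
        }
        // Reference types: L<binary name with slashes>;
        return "L" + type.getName().replace('.', '/') + ";";
    }

    /** Builds a method descriptor: "(" + parameter descriptors + ")" + return descriptor. */
    public static String methodDescriptor(Class<?> returnType, Class<?>... params) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> p : params) {
            sb.append(descriptor(p));
        }
        return sb.append(')').append(descriptor(returnType)).toString();
    }

    /** Native function name for a non-overloaded method with an ASCII name. */
    public static String nativeName(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    // '_' is escaped as "_1" first, then package separators become '_'.
    private static String mangle(String name) {
        return name.replace("_", "_1").replace('.', '_');
    }

    public static void main(String[] args) {
        // Constructor PlatformInfo(String, int)
        System.out.println(methodDescriptor(void.class, String.class, int.class));
        // Constructor DeviceInfo(int, String, String)
        System.out.println(methodDescriptor(void.class, int.class, String.class, String.class));
        System.out.println(nativeName("com.example.opencl.utility.OpenCLUtility", "getPlatformInfo"));
    }
}
```

The first two descriptors are the strings that the native side has to pass to `GetMethodID` together with `"<init>"` when it looks up the PlatformInfo and DeviceInfo constructors.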
OpenCLUtility.cpp:
    #include "OpenCLUtility.h"
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <CL/cl.h>

    #define LOCAL_SIZE 512

    cl_mem g_inputBuffer = NULL;
    cl_context g_context = NULL;
    cl_command_queue g_cmd_queue = NULL;
    cl_program g_program = NULL;
    cl_kernel g_kernel = NULL;

    #include "OpenCLUtil.h"

    // for perf. counters
    #include <Windows.h>
    LARGE_INTEGER g_PerfFrequency;
    LARGE_INTEGER g_PerformanceCountNDRangeStart;
    LARGE_INTEGER g_PerformanceCountNDRangeStop;

    struct JobInfo {
        char* companyName;
        char* companyAddress;
        char* telNumber;
        char* faxNumber;
    };
    typedef struct JobInfo TJobInfo;
    typedef struct JobInfo* PJobInfo;

    struct Element {
        int id;
        int identityCardNo;
        char* name;
        char* surname;
        int age;
        TJobInfo jobInfo;
        char* email;
        bool selected;
        bool masked;
    };
    typedef struct Element TElement;
    typedef struct Element* PElement;

    JNIEXPORT jobjectArray JNICALL Java_com_example_opencl_utility_OpenCLUtility_getPlatformInfo
      (JNIEnv *env, jobject obj) {
        /* Host/device data structures */
        cl_platform_id *platforms;
        cl_uint numberOfPlatformsAvailable;
        cl_device_id *devices;
        cl_uint num_devices, addr_data;
        cl_uint i, j;
        cl_int err;

        /* Extension data */
        char name_data[256], ext_data[4096];
        char platform_vendor[128];  /* one buffer, reused for every platform */

        /* First call: only ask how many platforms exist */
        err = clGetPlatformIDs(0, NULL, &numberOfPlatformsAvailable);
        if (err < 0) {
            fprintf(stderr, "Error: Couldn't find any platforms (%d)\n", err);
            exit(1);
        }
        platforms = (cl_platform_id*) malloc(sizeof(cl_platform_id) * numberOfPlatformsAvailable);
        /* Second call: fetch the platform IDs */
        err = clGetPlatformIDs(numberOfPlatformsAvailable, platforms, NULL);
        if (err < 0) {
            fprintf(stderr, "Error: Couldn't find any platforms (%d)\n", err);
            exit(1);
        }
        printf("There are %u platforms available in the system.\n", numberOfPlatformsAvailable);

        /* Look up the Java classes, constructors and setter used to build the result */
        jclass platformInfoClass = env->FindClass("com/example/opencl/utility/PlatformInfo");
        jclass deviceInfoClass = env->FindClass("com/example/opencl/utility/DeviceInfo");
        jmethodID platformInfoClassConstructorId =
            env->GetMethodID(platformInfoClass, "<init>", "(Ljava/lang/String;I)V");
        jmethodID deviceInfoClassConstructorId =
            env->GetMethodID(deviceInfoClass, "<init>", "(ILjava/lang/String;Ljava/lang/String;)V");
        jmethodID platformInfoSetDeviceInfoId =
            env->GetMethodID(platformInfoClass, "setDeviceInfo", "(ILcom/example/opencl/utility/DeviceInfo;)V");

        /* Create the result array of PlatformInfo */
        jobjectArray result = env->NewObjectArray(numberOfPlatformsAvailable, platformInfoClass, NULL);

        for (j = 0; j < numberOfPlatformsAvailable; ++j) {
            /* Determine number of connected devices */
            err = clGetDeviceIDs(platforms[j], CL_DEVICE_TYPE_ALL, 0, NULL, &num_devices);
            if (err < 0) {
                fprintf(stderr, "Error: Couldn't find any devices (%d)\n", err);
                exit(1);
            }
            clGetPlatformInfo(platforms[j], CL_PLATFORM_VENDOR,
                              sizeof(platform_vendor), platform_vendor, NULL);

            // Access connected devices
            devices = (cl_device_id*) malloc(sizeof(cl_device_id) * num_devices);
            clGetDeviceIDs(platforms[j], CL_DEVICE_TYPE_ALL, num_devices, devices, NULL);

            // Create PlatformInfo and set to the result array
            jstring vendorName = env->NewStringUTF(platform_vendor);
            jobject platformObject = env->NewObject(platformInfoClass,
                platformInfoClassConstructorId, vendorName, (jint) num_devices);
            env->SetObjectArrayElement(result, j, platformObject);

            // Obtain data for each connected device
            for (i = 0; i < num_devices; i++) {
                err = clGetDeviceInfo(devices[i], CL_DEVICE_NAME,
                                      sizeof(name_data), name_data, NULL);
                if (err < 0) {
                    fprintf(stderr, "Error: Couldn't read device name (%d)\n", err);
                    exit(1);
                }
                clGetDeviceInfo(devices[i], CL_DEVICE_ADDRESS_BITS,
                                sizeof(addr_data), &addr_data, NULL);
                clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS,
                                sizeof(ext_data), ext_data, NULL);

                // create device info and set it to the platform
                jstring extensions = env->NewStringUTF(ext_data);
                jstring deviceName = env->NewStringUTF(name_data);
                jobject deviceObject = env->NewObject(deviceInfoClass,
                    deviceInfoClassConstructorId, (jint) addr_data, extensions, deviceName);
                // setDeviceInfo returns void, so CallVoidMethod is the matching call
                env->CallVoidMethod(platformObject, platformInfoSetDeviceInfoId, (jint) i, deviceObject);
                env->DeleteLocalRef(extensions);
                env->DeleteLocalRef(deviceName);
                env->DeleteLocalRef(deviceObject);
            }
            env->DeleteLocalRef(vendorName);
            env->DeleteLocalRef(platformObject);
            free(devices);
        }
        // free data structures
        free(platforms);
        return result;
    }

    // Copies a Java string into a malloc'ed C string; the caller must free() it.
    char* jstringToCString(JNIEnv *env, jstring msg) {
        if (msg == NULL) {
            return NULL;
        }
        const char* cstringTemp = env->GetStringUTFChars(msg, NULL);
        // GetStringUTFLength counts encoded bytes; GetStringLength counts UTF-16
        // units and would under-allocate for non-ASCII text.
        int len = env->GetStringUTFLength(msg) + 1;
        char *resCString = (char*) malloc(sizeof(char) * len);
        strcpy(resCString, cstringTemp);
        env->ReleaseStringUTFChars(msg, cstringTemp);
        return resCString;
    }

    // Returns the JVM-owned buffer directly. The caller would have to hand it back
    // with ReleaseStringUTFChars, so the conversion below uses jstringToCString instead.
    char* jstringToCStringV2(JNIEnv *env, jstring msg) {
        return const_cast<char*>(env->GetStringUTFChars(msg, NULL));
    }

    // Writes the int fields of the native array back into the Java Element objects.
    void JniConvertToNativeArrayToElementArray(JNIEnv *env, jobject *obj,
                                               PElement nativeArray, jobjectArray *arr) {
        int arraySize = env->GetArrayLength(*arr);
        jclass elementClass = env->FindClass("com/example/domain/Element");
        // Field IDs
        jfieldID idFieldID = env->GetFieldID(elementClass, "id", "I");
        jfieldID ageFieldID = env->GetFieldID(elementClass, "age", "I");
        jfieldID identityCardNoFieldID = env->GetFieldID(elementClass, "identityCardNo", "I");
        for (int i = 0; i < arraySize; ++i) {
            jobject jniArrayElement = env->GetObjectArrayElement(*arr, i);
            env->SetIntField(jniArrayElement, ageFieldID, nativeArray[i].age);
            env->SetIntField(jniArrayElement, identityCardNoFieldID, nativeArray[i].identityCardNo);
            env->SetIntField(jniArrayElement, idFieldID, nativeArray[i].id);
            env->DeleteLocalRef(jniArrayElement);
        }
    }

    // Copies the Java Element[] into a native array of struct Element.
    PElement JniConvertElementArrayToNative(JNIEnv *env, jobject *obj, jobjectArray *arr) {
        int arraySize = env->GetArrayLength(*arr);
        // calloc zero-initialises the pointers and the selected/masked flags
        PElement nativeArray = (PElement) calloc(arraySize, sizeof(TElement));
        jclass elementClass = env->FindClass("com/example/domain/Element");
        jclass jobInfoClass = env->FindClass("com/example/model/JobInfo");
        // Field IDs
        jfieldID idFieldID = env->GetFieldID(elementClass, "id", "I");
        jfieldID ageFieldID = env->GetFieldID(elementClass, "age", "I");
        jfieldID nameFieldID = env->GetFieldID(elementClass, "name", "Ljava/lang/String;");
        jfieldID surnameFieldID = env->GetFieldID(elementClass, "surname", "Ljava/lang/String;");
        jfieldID emailFieldID = env->GetFieldID(elementClass, "email", "Ljava/lang/String;");
        jfieldID identityCardNoFieldID = env->GetFieldID(elementClass, "identityCardNo", "I");
        jfieldID jobInfoFieldID = env->GetFieldID(elementClass, "jobInfo", "Lcom/example/model/JobInfo;");
        jfieldID companyNameFieldID = env->GetFieldID(jobInfoClass, "companyName", "Ljava/lang/String;");
        jfieldID companyAddressFieldID = env->GetFieldID(jobInfoClass, "companyAddress", "Ljava/lang/String;");
        jfieldID telNumberFieldID = env->GetFieldID(jobInfoClass, "telNumber", "Ljava/lang/String;");
        jfieldID faxNumberFieldID = env->GetFieldID(jobInfoClass, "faxNumber", "Ljava/lang/String;");

        for (int i = 0; i < arraySize; ++i) {
            jobject jniArrayElement = env->GetObjectArrayElement(*arr, i);
            nativeArray[i].age = env->GetIntField(jniArrayElement, ageFieldID);
            nativeArray[i].id = env->GetIntField(jniArrayElement, idFieldID);
            nativeArray[i].identityCardNo = env->GetIntField(jniArrayElement, identityCardNoFieldID);
            nativeArray[i].name =
                jstringToCString(env, (jstring) env->GetObjectField(jniArrayElement, nameFieldID));
            nativeArray[i].surname =
                jstringToCString(env, (jstring) env->GetObjectField(jniArrayElement, surnameFieldID));
            nativeArray[i].email =
                jstringToCString(env, (jstring) env->GetObjectField(jniArrayElement, emailFieldID));
            // .jobInfo
            jobject jobInfo = env->GetObjectField(jniArrayElement, jobInfoFieldID);
            if (jobInfo != NULL) {
                nativeArray[i].jobInfo.companyName =
                    jstringToCString(env, (jstring) env->GetObjectField(jobInfo, companyNameFieldID));
                nativeArray[i].jobInfo.companyAddress =
                    jstringToCString(env, (jstring) env->GetObjectField(jobInfo, companyAddressFieldID));
                nativeArray[i].jobInfo.telNumber =
                    jstringToCString(env, (jstring) env->GetObjectField(jobInfo, telNumberFieldID));
                nativeArray[i].jobInfo.faxNumber =
                    jstringToCString(env, (jstring) env->GetObjectField(jobInfo, faxNumberFieldID));
                env->DeleteLocalRef(jobInfo);
            }
            env->DeleteLocalRef(jniArrayElement);
        }
        return nativeArray;
    }

    // Frees the strings copied by JniConvertElementArrayToNative and the array itself.
    void FreeNativeArray(PElement nativeArray, int arraySize) {
        for (int i = 0; i < arraySize; ++i) {
            free(nativeArray[i].name);
            free(nativeArray[i].surname);
            free(nativeArray[i].email);
            free(nativeArray[i].jobInfo.companyName);
            free(nativeArray[i].jobInfo.companyAddress);
            free(nativeArray[i].jobInfo.telNumber);
            free(nativeArray[i].jobInfo.faxNumber);
        }
        free(nativeArray);
    }

    JNIEXPORT void JNICALL Java_com_example_opencl_utility_OpenCLUtility_execute
      (JNIEnv *env, jobject obj, jobjectArray arr) {
        cl_int err = CL_SUCCESS;
        int totalSize = env->GetArrayLength(arr);
        // Only a multiple of 8 elements is handed to the kernel
        int arraySize = totalSize - totalSize % 8;
        printf("INFO: Received array of size %d\n", arraySize);
        PElement inputArray = JniConvertElementArrayToNative(env, &obj, &arr);

        if (Setup_OpenCL("SampleFilter.cl") == -1) {
            printf("ERROR: Failed to Setup OpenCL...\n");
            FreeNativeArray(inputArray, totalSize);
            return;
        }
        // allocate the buffer on top of the host array
        g_inputBuffer = clCreateBuffer(g_context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
                                       sizeof(TElement) * arraySize, inputArray, NULL);
        if (g_inputBuffer == (cl_mem) 0) {
            printf("ERROR: Failed to create Input Buffer...\n");
            FreeNativeArray(inputArray, totalSize);
            return;
        }
        printf("INFO: g_inputBuffer is created (%p)...\n", (void*) g_inputBuffer);

        err = clSetKernelArg(g_kernel, 0, sizeof(cl_mem), (void *) &g_inputBuffer);
        if (err != CL_SUCCESS) {
            printf("ERROR: Failed to set input g_kernel arguments...\n");
            clReleaseMemObject(g_inputBuffer);
            FreeNativeArray(inputArray, totalSize);
            return;
        }
        printf("INFO: g_kernel parameters are set...\n");

        size_t global_work_size[1] = { (size_t) arraySize };
        size_t local_work_size[1] = { LOCAL_SIZE };
        size_t offset[1] = { 0 };
        // In OpenCL 1.x the global size must be a multiple of the local size;
        // otherwise pass NULL and let the runtime pick the work-group size.
        const size_t *local = (arraySize % LOCAL_SIZE == 0) ? local_work_size : NULL;

        QueryPerformanceCounter(&g_PerformanceCountNDRangeStart);
        // execute kernel
        err = clEnqueueNDRangeKernel(g_cmd_queue, g_kernel, 1, offset,
                                     global_work_size, local, 0, NULL, NULL);
        if (err != CL_SUCCESS) {
            printf("ERROR: Failed to run kernel...\n");
            clReleaseMemObject(g_inputBuffer);
            FreeNativeArray(inputArray, totalSize);
            return;
        }
        err = clFinish(g_cmd_queue);
        QueryPerformanceCounter(&g_PerformanceCountNDRangeStop);

        // Map the buffer so the host sees the kernel's results
        void* tmp_ptr = clEnqueueMapBuffer(g_cmd_queue, g_inputBuffer, CL_TRUE, CL_MAP_READ, 0,
                                           sizeof(TElement) * arraySize, 0, NULL, NULL, NULL);
        if (tmp_ptr != inputArray) {
            printf("ERROR: clEnqueueMapBuffer failed to return original pointer\n");
            clReleaseMemObject(g_inputBuffer);
            FreeNativeArray(inputArray, totalSize);
            return;
        }
        err = clFinish(g_cmd_queue);
        clEnqueueUnmapMemObject(g_cmd_queue, g_inputBuffer, tmp_ptr, 0, NULL, NULL);
        clReleaseMemObject(g_inputBuffer);

        // retrieve perf. counter frequency
        QueryPerformanceFrequency(&g_PerfFrequency);
        printf("NDRange perf. counter time %f ms.\n",
               1000.0f * (float)(g_PerformanceCountNDRangeStop.QuadPart
                                 - g_PerformanceCountNDRangeStart.QuadPart)
                       / (float) g_PerfFrequency.QuadPart);

        // Copy the results back into the Java objects
        JniConvertToNativeArrayToElementArray(env, &obj, inputArray, &arr);
        FreeNativeArray(inputArray, totalSize);
        g_inputBuffer = NULL;
    }
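When a Java string is copied into a C buffer through `GetStringUTFChars`, the buffer has to be sized by the number of encoded bytes, not by the number of UTF-16 units that `String.length()` (or JNI's `GetStringLength`) reports. The sketch below shows the difference on the Java side. It uses standard UTF-8 as an approximation: JNI's modified UTF-8 differs from it only for U+0000 and for supplementary characters. The class `JniStringSize` is a hypothetical helper for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class JniStringSize {

    /** Number of UTF-16 code units, which is what String.length() reports. */
    public static int utf16Units(String s) {
        return s.length();
    }

    /** Number of bytes in the standard UTF-8 encoding of the string. */
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "Kurt";
        String turkish = "G\u00fcl"; // contains U+00FC, which needs two bytes in UTF-8
        System.out.println(ascii + ": " + utf16Units(ascii) + " units, " + utf8Bytes(ascii) + " bytes");
        System.out.println(turkish + ": " + utf16Units(turkish) + " units, " + utf8Bytes(turkish) + " bytes");
    }
}
```

For pure ASCII text the two counts agree, which is why a buffer sized from the UTF-16 length can appear to work until the first non-ASCII name arrives.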
compile.bat:
cl -EHsc "-I%JAVA_HOME%\include" "-I%JAVA_HOME%\include\win32" "-I%INTELOCLSDKROOT%\include" -LD %1 OpenCL.lib -Fe%2 /link /LIBPATH:"%INTELOCLSDKROOT%\lib\x86" OpenCL.lib
build_dll.bat:
call setenv.bat
IF NOT EXIST myopencllib.dll call compile.bat OpenCLUtility.cpp myopencllib.dll
cmd> build_dll.bat
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
OpenCLUtility.cpp
Microsoft (R) Incremental Linker Version 9.00.21022.08
Copyright (C) Microsoft Corporation. All rights reserved.
/dll
/implib:myopencllib.lib
/out:myopencllib.dll
"/LIBPATH:C:\Program Files (x86)\Intel\OpenCL SDK\2.0\lib\x86"
OpenCL.lib
OpenCLUtility.obj
OpenCL.lib
Creating library myopencllib.lib and object myopencllib.exp
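The DLL produced above is the file that `System.loadLibrary("myopencllib")` has to find at run time. As a small sketch, the standard `System.mapLibraryName` call shows which file name the JVM derives from the library name on the current operating system, and the `java.library.path` system property lists the directories it searches. The class `LibraryLookup` is a hypothetical helper for this illustration.

```java
import java.io.File;

public class LibraryLookup {

    /** File name the JVM looks for when System.loadLibrary(name) is called. */
    public static String expectedFileName(String name) {
        return System.mapLibraryName(name);
    }

    /** Directories searched by System.loadLibrary, taken from java.library.path. */
    public static String[] searchPath() {
        return System.getProperty("java.library.path", "").split(File.pathSeparator);
    }

    public static void main(String[] args) {
        System.out.println("Looking for: " + expectedFileName("myopencllib"));
        for (String dir : searchPath()) {
            System.out.println("  in " + dir);
        }
    }
}
```

If the DLL is not in one of the listed directories, the JVM can be started with `-Djava.library.path=` pointing at the folder that contains it.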
As the kernel function, we will use the __kernel void SampleFilter(__global struct Element* pSrc) function from the SampleFilter.cl file below:
    struct JobInfo {
        char* companyName;
        char* companyAddress;
        char* telNumber;
        char* faxNumber;
    };
    typedef struct JobInfo TJobInfo;

    struct Element {
        int id;
        int identityCardNo;
        char* name;
        char* surname;
        int age;
        TJobInfo jobInfo;
        char* email;
        bool selected;
        bool masked;
    };

    __kernel void SampleFilter(__global struct Element* pSrc)
    {
        const int x = get_global_id(0);
        const int size = get_local_size(0);

        pSrc[x].id = pSrc[x].id + 5;
        pSrc[x].identityCardNo = pSrc[x].identityCardNo + 1000;
        pSrc[x].age = pSrc[x].age + 10;
    }
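To check the values that come back from the native call, it can help to have a plain-Java reference of what the kernel does to each element. The sketch below is one possible version. Its nested `Element` class is a hypothetical stand-in for `com.example.domain.Element` that carries only the three int fields the kernel modifies, and the parallel stream is merely a CPU-side analogue of the work-items, not what OpenCL executes. It also mirrors the native `execute` function in processing only a multiple of 8 elements.

```java
import java.util.stream.IntStream;

public class SampleFilterReference {

    /** Hypothetical stand-in for com.example.domain.Element (int fields only). */
    public static class Element {
        public int id;
        public int identityCardNo;
        public int age;

        public Element(int id, int identityCardNo, int age) {
            this.id = id;
            this.identityCardNo = identityCardNo;
            this.age = age;
        }
    }

    /** Applies the kernel body to the first (length - length % 8) elements. */
    public static void apply(Element[] src) {
        int count = src.length - src.length % 8;
        // Each index x plays the role of get_global_id(0) in the kernel.
        IntStream.range(0, count).parallel().forEach(x -> {
            src[x].id += 5;
            src[x].identityCardNo += 1000;
            src[x].age += 10;
        });
    }

    public static void main(String[] args) {
        Element[] data = new Element[10];
        for (int i = 0; i < data.length; i++) {
            data[i] = new Element(i, 100 + i, 20 + i);
        }
        apply(data);
        for (Element e : data) {
            System.out.println(e.id + " " + e.identityCardNo + " " + e.age);
        }
    }
}
```

With ten input elements, the first eight are updated and the last two keep their original values, which is the same pattern to expect in the array passed to `OpenCLUtility.execute`.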