Mohsen Ghasempour, Ph.D.

Greater Manchester, England, United Kingdom
1K followers 500+ connections

Experience & Education

  • Kingfisher plc

Publications

  • Analysis of FPGA and Software Approaches to Simulate Unconventional Computer Architectures

    International Conference on ReConFigurable Computing and FPGAs (ReConFig)

    The design of new computer architectures relies heavily on simulation. New architectures that incorporate unconventional features or novel designs cannot usually use established simulators and, therefore, designers have to develop their own.
    Traditionally, software simulators have been the main platform for architectural design, based on the conventional wisdom that software is flexible and easy to program, albeit slow, while hardware is fast but difficult to develop. The introduction of high-level hardware description languages (HDLs), such as Bluespec, together with improvements in FPGAs, provides an opportunity to challenge the traditional notion and consider hardware simulators for this purpose.
    This paper presents a comprehensive analysis of the performance and the implementation effort of two simulators, one FPGA based and one software based, developed to simulate a novel, unconventional architecture. The analysis uses the interconnection network of the SpiNNaker massively-parallel computer as a case study, which allows a comparison with the real system.

  • DReAM: Dynamic Re-arrangement of Address Mapping to Improve the Performance of DRAMs

    The International Symposium on Memory Systems (MEMSYS)

    The initial location of data in DRAMs is determined and controlled by the ‘address-mapping’ and even modern memory controllers use a fixed and run-time-agnostic address mapping. On the other hand, the memory access pattern seen at the memory interface level will dynamically change at run-time. This dynamic nature of the memory access pattern and the fixed behavior of the address-mapping process in DRAM controllers, implied by using a fixed address-mapping scheme, mean that DRAM performance cannot be exploited efficiently.
    DReAM is a novel hardware technique that can detect a workload-specific address mapping at run-time based on the application access pattern, which improves the performance of DRAMs. The experimental results show that DReAM outperforms the best evaluated address mapping on average by 9% for mapping-sensitive workloads, by 2% for mapping-insensitive workloads, and by up to 28% across all the workloads. DReAM can be seen as an insurance policy capable of detecting which scenarios are not well served by the predefined address mapping.

  • HAPPY: Hybrid Address-based Page Policy in DRAMs

    The International Symposium on Memory Systems (MEMSYS)

    Memory controllers have used static page closure policies to decide whether a row should be left open (open-page policy) or closed immediately (close-page policy) after the row has been accessed. The appropriate choice for a particular access can reduce the average memory latency. However, since application access patterns change at run time, static page policies cannot guarantee to deliver optimum execution time. Hybrid page policies have been investigated as a means of covering these dynamic scenarios and are now implemented in state-of-the-art processors. Hybrid page policies switch between open-page and close-page policies while the application is running, by monitoring the access pattern of row hits/conflicts and predicting future behavior. Unfortunately, as the size of DRAM memory increases, fine-grain tracking and analysis of memory access patterns does not remain practical.
    We propose a compact memory address-based encoding technique which can improve or maintain the performance of DRAM page closure predictors while reducing the hardware overhead in comparison with state-of-the-art techniques. As a case study, we integrate our technique, HAPPY, with a state-of-the-art Intel-adaptive monitor (e.g. part of the Intel Xeon X5650) and a traditional Hybrid page policy. We evaluate them across 70 memory-intensive workload mixes consisting of single-thread and multi-thread applications. The experimental results show that using the HAPPY encoding applied to the Intel-adaptive page closure policy can reduce the hardware overhead by 5× for the evaluated 64 GB memory (up to 40× for a 512 GB memory) while maintaining the prediction accuracy.

  • Accelerating Interconnect Analysis using High-Level HDLs and FPGA, SpiNNaker as a Case Study

    The 23rd IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM)

    Architectural simulation is a fundamental tool for modern computing system design. Computer architects can choose from a large set of software simulators that provide a robust and efficient platform for design exploration. However, if the new architecture incorporates unconventional or novel features not supported by the existing simulators, the designers must develop their own. As reconfigurable hardware platforms grow more computationally capable, we observe a move from software simulators towards these hardware platforms. The introduction of high-level HDLs, such as Bluespec System Verilog (BSV), offers improved productivity while still providing a tool flow capable of exploiting reconfigurable platforms. This paper focuses on understanding how to accelerate the simulation of the interconnection network of SpiNNaker [1], a massively-parallel computer for neural simulation. We analysed the modelling choices and trade-offs made during the implementation of the software (SW) model as well as those made when developing a new hardware (HW) model built on a Xilinx FPGA.

  • An empirical evaluation of High-Level Synthesis languages and tools for database acceleration

    IEEE International Conference on Field Programmable Logic and Applications (FPL)

    High Level Synthesis (HLS) languages and tools are emerging as the most promising technique to make FPGAs more accessible to software developers. Nevertheless, picking the most suitable HLS for a certain class of algorithms depends on requirements such as area and throughput, as well as on programmer experience.

  • SoC Simulator on FPGA using Bluespec System Verilog

    UK Electronics Forum (UKEF)

    Building large computing systems requires modelling them first. Modern hardware systems are so complex that their software models, at the desired level of detail, may be too slow. Thus abstract hardware modelling can be appropriate. This paper presents an example software/hardware model built using the Bluespec System Verilog (BSV) design flow to give rapid simulation of a hardware system. The chosen example was a hardware model of the on-chip router and the on-chip and off-chip networks of SpiNNaker, used to understand the behaviour of the traffic in the system. A model of a 5×5 SpiNNaker topology has been implemented on a Virtex-5 FPGA using BSV, and a Graphical User Interface (GUI) was developed in LabVIEW for graphical representation of the results.

  • Ultra-Low Power Transmitter

    IEEE International Symposium on Circuits and Systems (ISCAS)

    This paper presents the design of an ultra-low power UWB transmitter based on 4th and 5th derivative Gaussian pulse shapes, implemented in UMC 90 nm CMOS technology. The simulations show a 119 mV peak-to-peak pulse amplitude and a pulse width of 240 ps for the 5th derivative Gaussian pulse, and a 99.71 mV pulse amplitude and a 190 ps pulse width for the 4th derivative Gaussian pulse. The power consumption of the pulse generators is calculated to be 30.11 µW and 21.5 µW for the 5th and 4th derivative Gaussian pulses respectively, at a 100 MHz pulse repetition frequency (PRF). Ultra-low power radio transmission is important in application contexts such as wireless network nodes and sensors powered by energy harvesters.

Patents

  • MONITORING DEVICE

    Filed US PCT/EP2016/050385

    This patent is an outcome of the ARMOR project. ARMOR has received funding from The University of Manchester Intellectual Property (UMIP) and is currently in the process of commercialisation. An FPGA-based prototype of ARMOR is currently in production and will be available soon. For more information, please visit the ARMOR website.
    Website: http://apt.cs.manchester.ac.uk/projects/ARMOR/RowHammer/index.html

Courses

  • Hardware support for trustworthy systems by Ted Huffmire

  • PostgreSQL Advanced Development & Performance

  • PostgreSQL Database Administration

  • PostgreSQL Development

  • Processor microarchitecture by Antonio Gonzalez

  • Reliability: the next frontier for systematic benchmarking and monitoring tools by Murali Annavaram

  • Resource management in reconfigurable computing systems by Katherine Compton

Projects

  • ARMOR (A Run-time Memory hot-row detectOR)

    ARMOR is a hardware solution to prevent row hammer errors in DRAMs. Row hammering can occur when a specific wordline in a DRAM is activated repeatedly within a refresh interval. In this situation the neighboring cells leak charge at a faster rate than expected. Thus, the retention time of such cells becomes shorter than the refresh interval (e.g. 64 ms), which means that these cells may lose their data (charge) before the refresh happens.

  • AXLE (Advanced Analytics for Extremely Large European Databases)

    The objective of the AXLE project is to greatly improve the speed and quality of decision making on real-world data sets.
    AXLE is aimed at databases with these characteristics:
    Important
    Highly Secure
    Complex
    Standardised/Widely Used
    Extremely Large

Honors & Awards

  • Full PhD Sponsorship

    University of Manchester

Languages

  • English

    Professional working proficiency

  • Persian

    Native or bilingual proficiency
